Abstract

Recent developments in the mathematical foundations of quantum mechanics have brought 
the theory closer to that of classical probability and statistics. On the other hand, the unique 
character of quantum physics sets many of the questions addressed apart from those met 
classically in stochastics. Furthermore, concurrent advances in experimental techniques and 
in the theory of quantum computation have led to a strong interest in questions of quantum 
information, in particular in the sense of the amount of information about unknown parame- 
ters in given observational data or accessible through various possible types of measurements. 
This scenery is outlined. (A shorter version of the paper, omitting some topics but otherwise
much improved, is available as quant-ph/0307191).



*MaPhySto is the Centre for Mathematical Physics and Stochastics, funded by the Danish National Research 
Foundation, University of Aarhus, Denmark. 
1 Introduction



In the last two decades, developments of an axiomatic type in the mathematical foundations of 
quantum mechanics have brought the theory closer to that of classical probability and statistics. 
On the other hand, the unique character of quantum physics (we use the terms 'quantum me- 
chanics' and 'quantum physics' synonymously) sets many of the questions addressed apart from 
those met classically in stochastics. The key mathematical notion is that of a quantum instrument, 
which we shall describe in Section 2.3 and which, for arbitrary quantum experiments, specifies the
joint probability distribution of the observational outcome of the experiment together with the 
state of the physical system after the experiment. Concurrently with these theoretical develop- 
ments, major advances in experimental techniques have opened many possibilities for studying 
small quantum systems and this has led to considerable current interest in a range of questions 
that in essence belong to statistical inference and are concerned with the amount of information 
about unknown parameters in given observational data or accessible through various possible types 
of measurements. In quantum physics, the realm of possible experiments is specified mathemat- 
ically, and noncommutativity between experiments plays a key role. Separate measurements on 
independent and separate systems result in independent observations, as in classical stochastics. 
However, joint measurements allow for major increases in statistical information. 

The present paper outlines some of these developments and contains suggestions for additional 
readings and further work. We make some new contributions to the theory of quantum statistical 
inference, in particular, developing new notions of quantum sufficiency and exhaustivity. We 
give complete but short proofs of the quantum information (Cramér-Rao) bound and some of its
consequences, filling some gaps in the proofs in the physics literature. The paper does not contain 
practical examples in the sense of real data analyses, for several reasons. For one thing, the 
realistic modelling of present-day laboratory experiments in this field involves several more layers 
of complexity (technical, not conceptual) on top of the picture presented here. The closest we come 
to real data is in our discussion of quantum tomography in Section 7.2. For another thing, the
theory in this paper is largely concerned with the design rather than the analysis of experiments in 
quantum physics, and there is still a gap between what is theoretically possible under the laws of 
quantum mechanics, and what is practically possible in the laboratory, though this gap is closing 
fast. 'Information' is understood throughout in the sense it has in mathematical statistics. We 
do not discuss quantum information theory in the sense of optimal coding and transmission of 
messages through quantum communication channels, nor in the more general sense of quantum
information processing (Green, 2000). Within quantum statistics, we concentrate on the topics
of estimation and of inference. The classic books of Helstrom (1976) and Holevo (1982) are on
the other hand largely devoted to a decision theoretic approach to hypothesis testing problems.
See Parthasarathy (1999) and Ogawa and Nagaoka (2000) for recent contributions to this field.
Confusingly, the phrase 'maximum likelihood estimator' has an unorthodox meaning in the older
literature. In many papers, of which we just mention a few recent ones, Belavkin (1994, 2000,
2001) develops a continuous time Bayesian filtering approach to estimation and control.

It should be emphasised from the start, that we see quantum mechanics as describing classi- 
cal probability models for the outcomes of laboratory experiments, or indeed, for the real world 
outcomes of any interactions between 'the quantum world' of microscopic particles and 'the real 
world' in which statisticians analyse data. Those probability models may depend on unknown 
parameters, and quantum statistics is concerned with statistical design and inference concerning 
those parameters. This point of view is commonplace in experimental quantum physics but seems 
to be less common in theoretical physics and in some parts of pure mathematics, in particular 
in the field called 'quantum probability', where a special nature is claimed for the randomness of 
quantum mechanics, placing it outside the ambit of classical probability and statistics. We dis- 
agree firmly with this conclusion though we do agree that there are fascinating foundational issues 
in quantum mechanics. We develop our stance on these issues further in Section 8. Quantum
mechanics is concerned with randomness of the most fundamental nature known to science, and 
probabilists and statisticians definitely should be involved in the game, rather than excluded from 
it. 

In quantum mechanics the state of a physical system is described by a non-negative self-adjoint
operator $\rho$ (referred to as the state) with trace 1, on a separable complex Hilbert space
$\mathcal{H}$. In accordance with the previous paragraphs, our interest in this paper concerns cases
where the state is specified only up to some unknown parameter $\theta$ and the question is what
can be learned about the parameter from observation of the system.

Many of the central ideas can be illustrated by finite-dimensional quantum systems, the simplest
being based on those in which $\mathcal{H}$ has (complex) dimension 2. We shall often use the phrase
'spin-half particle' to refer to such a quantum system, as one of the best known examples concerns 
the magnetic moment or spin of the electron, which in appropriate units can only take on the 
values $\pm\frac{1}{2}$. But a two-dimensional state space is also appropriate for modelling the polarisation
of one photon, and yet another example is provided by an atom at very low temperature when 
only its ground state and first excited state are relevant. The theory of quantum computation 
is concerned with how a finite collection of two-dimensional quantum systems, which are then 
called qubits, can be used to carry and manipulate information. We shall mainly concentrate 
on such examples. However, many physical problems concern infinite-dimensional systems, one 
area of great current interest being quantum tomography and quantum holography, which we 
shall discuss briefly. While the theory for finite-dimensional systems can be outlined in relatively 
simple mathematical terms, in general it is necessary to draw on advanced aspects of the theory 
of operators on infinite-dimensional Hilbert spaces and we will only outline this, with quantum
tomography in mind, in Section 7.

The paper is organised as follows. Section 2 describes the mathematical structure linking
states of a quantum system, possible measurements on that system, and the resulting state of
the system after measurement. Section 3 introduces quantum statistical models and notions of
quantum score and quantum information, parallel to the score function and Fisher information
of classical parametric statistical models. In Section 4 we introduce quantum exponential models
and quantum transformation models, again forming a parallel with fundamental classes of models
in classical statistics. In Section 5 we describe the notions of quantum exhaustivity and quantum
sufficiency of a measurement, relating them to the classical notion of sufficiency. We next, in
Section 6, turn to a study of the relation between quantum information and classical Fisher
information, in particular through Cramér-Rao type information bounds. In Section 7 we discuss
the infinite-dimensional model of quantum tomography, which poses the challenge of developing
non-parametric quantum information bounds. In Section 8 we discuss the difference between
classical and quantum probability and statistics, relating them both to foundational issues in
quantum physics and to emerging quantum technologies. Finally in Section 9 we conclude with
remarks on further topics, in particular, quantum stochastic processes. The appendix contains
some mathematical details.

This paper greatly extends our more mathematical survey (Barndorff-Nielsen, Gill, and Jupp,
2001) on quantum statistical information. Gill (2001a,b) contains further introductory material.
Many proofs and further details will be found in Barndorff-Nielsen, Gill, and Jupp (2002). Some
general references which we have found extremely useful are the books of Isham (1995), Peres
(1995), Gilmore (1994), Holevo (1982, 2001c). Finally, the Los Alamos National Laboratory
preprint service for quantum physics, quant-ph at http://xxx.lanl.gov, is an invaluable resource.



2 States, Measurements and Instruments 

In quantum mechanics the state of any physical system to be investigated is described by an
operator $\rho$ on a complex separable Hilbert space $\mathcal{H}$ such that $\rho$ is non-negative
and (hence) self-adjoint and has trace 1. In this paper (except for Section 7) we shall restrict
attention to the case where $\mathcal{H}$ is finite-dimensional, and our examples will mainly concern
the spin of spin-half particles, where the dimension of $\mathcal{H}$ is 2. The classic example in this
context is the 1922 experiment of Stern and Gerlach, see Brandt and Dahmen (1995, Section 1.4),
to determine the size of the magnetic moment of the electron. The electron was conceived of as
spinning around an axis and therefore behaving as a magnet pointing in some direction.
Mathematically, each electron carries a vector 'magnetic moment'. One might expect the size of the
magnetic moment of all electrons to be the same, but the directions to be uniformly distributed in
space. Stern and Gerlach made a beam of silver atoms move transversely through a steeply increasing
vertical magnetic field. A silver atom has 47 electrons but it appears that the magnetic moments of
the 46 inner electrons cancel and essentially only one electron determines the spin of the whole
system. Classical physical reasoning predicts that the beam would emerge spread out vertically
according to the component of the spin of each atom (or electron) in the direction of the gradient
of the magnetic field. The spin itself would not be altered by passage through the magnet. However,
amazingly, the emerging beam consisted of just two well separated components, as if the component
of the spin vector in the vertical direction of each electron could take on only two different values.

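In this case, $\mathcal{H}$ can be thought of as $\mathbb{C}^2$, i.e. as pairs of complex numbers,
and, correspondingly, $\rho$ is a $2 \times 2$ matrix

$$\rho = \begin{pmatrix} \rho_{11} & \rho_{12} \\ \rho_{21} & \rho_{22} \end{pmatrix}$$

with $\rho_{21} = \overline{\rho_{12}}$ (the bar denoting complex conjugation) and non-negative real
eigenvalues $p_1$ and $p_2$ satisfying $p_1 + p_2 = 1$.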

The result of performing a measurement on the system in state $\rho$ is a random variable $x$ taking
values in a measure space $(\mathcal{X}, \mathcal{A})$ and with law of the form

$$\Pr(x \in A) = \mathrm{tr}\{\rho M(A)\},$$

where $M$ is a mapping from the $\sigma$-algebra $\mathcal{A}$ into the space
$\mathbb{SA}_+(\mathcal{H})$ of non-negative self-adjoint operators on $\mathcal{H}$ which satisfies
$M(\mathcal{X}) = \mathbf{1}$ (where $\mathbf{1}$ is the identity operator) and

$$\sum_{i=1}^{\infty} M(A_i) = M(A)$$

for any finite or countable sequence $\{A_1, A_2, \ldots\}$ of disjoint elements of $\mathcal{A}$ and
$A = \bigcup_{i=1}^{\infty} A_i$ (the sum in the formula being defined in the sense of weak
convergence of operators). Such a mapping $M$ is said to be a (generalised) measurement. We shall
also refer to $M$ as an operator-valued probability measure or OProM for short. In the literature
the usual names and acronyms are probability operator-valued measure or positive operator-valued
measure (POM or POVM), and (nonorthogonal, generalised) resolution of the identity.

The most basic measurements, which are among the class of simple measurements defined in
Section 2.2, have $\mathcal{X}$ a finite set of real numbers, with cardinality less than or equal to
the dimension of $\mathcal{H}$, $\mathcal{A}$ as the $\sigma$-algebra of all subsets of $\mathcal{X}$,
and $M(\{x\}) = \Pi_{[x]}$ for any atom $\{x\}$ of $\mathcal{A}$, the $\Pi_{[x]}$ being mutually
orthogonal projection operators with $\sum_x \Pi_{[x]} = \mathbf{1}$. We speak then of a
projector-valued probability measure or PProM. The usual terminology in the literature is a PVM or
(orthogonal) resolution of the identity. All the ingredients of such a simple measurement are
encapsulated in the specification of a self-adjoint operator $Q$ on $\mathcal{H}$ with eigenvalues
$x$ in $\mathcal{X}$ and eigenspaces which are precisely those subspaces onto which the $\Pi_{[x]}$
project. The operator $Q = \sum_x x\,\Pi_{[x]}$ is called the observable. Conversely, any
self-adjoint operator on $\mathcal{H}$ can be given an interpretation as an observable. We denote the
space of self-adjoint operators (observables) by $\mathbb{SA}(\mathcal{H})$ and the set of states
$\rho$ by $\mathbb{S}(\mathcal{H})$. The adjoint of an operator is indicated by an asterisk $*$.

Physics textbooks on quantum theory usually take the concept of observables as a starting
point. In the infinite-dimensional case, observables (self-adjoint operators, not necessarily
bounded) may have continuous spectrum instead of discrete eigenvalues. But the one-to-one
correspondence between PProM's and observables continues to hold. Any self-adjoint operator on
$\mathcal{H}$ can be given an interpretation as an observable.

Let $M$ be a measurement. We shall often assume that $M$ is dominated by a $\sigma$-finite measure
$\nu$ on $(\mathcal{X}, \mathcal{A})$ and we shall write $m(x)$ for the density of $M$ with respect
to $\nu$. Thus

$$M(A) = \int_A m(x)\,\nu(dx).$$

We can take $m(x)$ to be self-adjoint and nonnegative for all $x$. (If $\mathcal{H} = \mathbb{C}^d$,
then $M(A)$ and $m(x)$ may be considered as $d \times d$ matrices of complex numbers.) The law of
$x$ is also dominated by $\nu$ and the probability density function of $x$ is

$$p(x) = \mathrm{tr}\{\rho\, m(x)\}.$$

The physical state may depend on an unknown parameter $\theta$, which runs through some parameter
space $\Theta$. In this case we denote the state by $\rho(\theta)$. Then the law of the outcome $x$
of a measurement $M$ depends on $\theta$ as well and we indicate this by writing $P_\theta(A)$ or
$p(x;\theta)$ for the probability or the probability density, as the case may be. In particular,

$$p(x;\theta) = \mathrm{tr}\{\rho(\theta)\, m(x)\}. \qquad (1)$$

It may also be relevant to stress the dependence on $M$ and we then write $p(x;\theta;M)$, etc. We
shall refer to the present kind of setting as a parametric quantum model $(\rho, M)$ or $(\rho, m)$
with elements $\rho = (\theta \mapsto \rho(\theta))$ and $M$, or its density $m$. It is also relevant
to consider cases where the measurement $M$ depends on some unknown parameter, but we shall not
discuss this possibility further in the present paper. When the measurement $M$ is given, a problem
of classical statistical inference results concerning the model for the distribution of the outcome.
However, it turns out that the model for the state $\theta \mapsto \rho(\theta)$ can be usefully
studied independently of which measurement is made of the system (or in order to choose the best
measurement) and then quantum analogues of many concepts from classical statistical inference
become important.

OProM's specify the probabilistic law of the outcome of an actual measurement but do not say
anything about the state of the physical system after the measurement has been performed. The
mathematical concept of quantum instrument prescribes both the OProM for the measurement
and the posterior state.

The next three subsections discuss in more detail the concepts of states, measurements (or
OProM's), and quantum instruments.



2.1 States

As stated at the beginning of the section, the state of a quantum system is represented by an
operator $\rho$ in $\mathbb{S}(\mathcal{H})$. It is often called the density matrix or density
operator of the system. We think of vectors $\psi$ in $\mathcal{H}$ as column vectors, and will
emphasise this by writing $|\psi\rangle$ (Dirac's 'ket' notation). The adjoint (complex conjugate
and transpose) of $|\psi\rangle$ is a row vector, which we denote by $\langle\psi|$ (Dirac's 'bra'
notation).

The simplest states, called pure states, are the projectors of rank one, i.e. they are of the form
$\rho = |\psi\rangle\langle\psi|$, where $|\psi\rangle$ is a unit vector in $\mathcal{H}$ (so
$\langle\psi|\psi\rangle = 1$), called the state-vector of the pure state $\rho$. If $\mathcal{H}$
has dimension $d$ then the set $\mathbb{S}_1(\mathcal{H})$ of pure states can be identified with the
complex projective space $\mathbb{CP}^{d-1}$. In particular, $\mathbb{S}_1(\mathbb{C}^2)$ can be
identified with the sphere $S^2$, which is known in theoretical physics as the Poincaré sphere, in
quantum optics as the Bloch sphere, and in complex analysis as the Riemann sphere.

Example 1 (Spin-half). Take $\mathcal{H} = \mathbb{C}^2$, so that $\mathcal{H}$ has complex
dimension 2, the space of general operators on $\mathcal{H}$ has real dimension 8, and the space
$\mathbb{SA}(\mathcal{H})$ of self-adjoint operators on $\mathcal{H}$ has real dimension 4.

The space $\mathbb{SA}(\mathcal{H})$ is spanned by the identity matrix

$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

together with the Pauli matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
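
Note that $\sigma_x$, $\sigma_y$ and $\sigma_z$ satisfy the commutation relations

$$[\sigma_x, \sigma_y] = 2i\sigma_z, \qquad
[\sigma_y, \sigma_z] = 2i\sigma_x, \qquad
[\sigma_z, \sigma_x] = 2i\sigma_y,$$

where, for any operators $A$ and $B$, their commutator $[A, B]$ is defined as $AB - BA$; and note that

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbf{1}.$$

Any pure state has the form $|\psi\rangle\langle\psi|$ for some unit vector $|\psi\rangle$ in
$\mathbb{C}^2$. Up to a complex factor of modulus 1 (the phase, which does not influence the state),
we can write $|\psi\rangle$ as

$$|\psi\rangle = \begin{pmatrix} e^{-i\varphi/2}\cos(\vartheta/2) \\ e^{i\varphi/2}\sin(\vartheta/2) \end{pmatrix}.$$

The corresponding pure state is

$$\rho = \begin{pmatrix}
\cos^2(\vartheta/2) & e^{-i\varphi}\cos(\vartheta/2)\sin(\vartheta/2) \\
e^{i\varphi}\cos(\vartheta/2)\sin(\vartheta/2) & \sin^2(\vartheta/2)
\end{pmatrix}.$$

A little algebra shows that $\rho$ can be written as
$\rho = (\mathbf{1} + u_x\sigma_x + u_y\sigma_y + u_z\sigma_z)/2 = \frac{1}{2}(\mathbf{1} + \vec{u}\cdot\vec{\sigma})$,
where $\vec{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are the three Pauli spin matrices and
$\vec{u} = (u_x, u_y, u_z) = \vec{u}(\vartheta, \varphi)$ is the point on $S^2$ with polar
coordinates $(\vartheta, \varphi)$. □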
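
The 'little algebra' of Example 1 can be checked numerically. The following is a minimal sketch in plain Python, not part of the original development: the function names (`pure_state`, `trace_product`, `bloch_vector`) and the particular angles are illustrative assumptions. It builds $\rho$ from the polar angles and recovers $\vec{u}$ through $u_a = \mathrm{tr}\{\rho\sigma_a\}$, which follows from $\mathrm{tr}\{\sigma_a\sigma_b\} = 2\delta_{ab}$.

```python
import cmath
import math

# Pauli matrices as nested lists (rows of complex entries).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def pure_state(theta, phi):
    """Density matrix |psi><psi| of the state-vector with polar angles (theta, phi)."""
    psi = [cmath.exp(-1j * phi / 2) * math.cos(theta / 2),
           cmath.exp(1j * phi / 2) * math.sin(theta / 2)]
    return [[psi[r] * psi[c].conjugate() for c in range(2)] for r in range(2)]


def trace_product(a, b):
    """tr(ab) for two 2 x 2 matrices."""
    return sum(a[r][k] * b[k][r] for r in range(2) for k in range(2))


def bloch_vector(rho):
    """Coefficients u_a = tr(rho sigma_a) in rho = (1 + u . sigma) / 2."""
    return tuple(trace_product(rho, s).real for s in (SIGMA_X, SIGMA_Y, SIGMA_Z))


# Illustrative angles (any values would do).
theta, phi = 1.0, 0.5
u = bloch_vector(pure_state(theta, phi))

# The point on the unit sphere with polar coordinates (theta, phi).
expected = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))
```

Under these assumptions `u` agrees with `expected` up to rounding, and its squared length is 1, as it should be for a pure state.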



2.1.1 Mixing and Superposition 

There are two important ways of constructing new states from old. Firstly, since the set of states
is convex, new states can be obtained by mixing states $\rho_1, \ldots, \rho_m$, i.e. taking convex
combinations

$$p_1\rho_1 + \cdots + p_m\rho_m, \qquad (2)$$

where $p_1, \ldots, p_m$ are real with $p_i \geq 0$ and $p_1 + \cdots + p_m = 1$. If $\mathcal{H}$ is
finite-dimensional then all states are of the form (2) with the $\rho_i$ pure, so that
$\mathbb{S}(\mathcal{H})$ is the convex hull of $\mathbb{S}_1(\mathcal{H})$; in the
infinite-dimensional case one needs infinite mixtures. For this reason, states which are not pure
are called mixed states. In particular, if $\mathcal{H} = \mathbb{C}^2$ then the set of pure states
is the Poincaré sphere, whereas the set of mixed states is the interior of the corresponding unit
ball.

If $\mathcal{H} = \mathbb{C}^d$ then mixing the pure states by the uniform probability measure on
$\mathbb{CP}^{d-1}$ gives a state which is invariant under the action $\rho \mapsto U\rho U^*$ of
$SU(d)$, the group of special (determinant $+1$) unitary ($UU^* = U^*U = \mathbf{1}$) matrices of
order $d$; this is the unique such invariant state.

The other important way of constructing new states from old is by superposition. The superposition
principle states that a complex linear combination of state-vectors is also a physically
possible state-vector. Let $|\psi_1\rangle\langle\psi_1|, \ldots, |\psi_m\rangle\langle\psi_m|$ be
pure states on $\mathcal{H}$. Then any state which can be written in the form
$\langle\psi|\psi\rangle^{-1}|\psi\rangle\langle\psi|$, where

$$\psi = w_1\psi_1 + \cdots + w_m\psi_m$$

and $w_1, \ldots, w_m$ are some complex numbers, is called a superposition of the pure states with
state-vectors $|\psi_1\rangle, \ldots, |\psi_m\rangle$ (here the phases of the state-vectors are
relevant!).

The difference between superposition and mixing may be illustrated by a spin-half example:
take $\langle\psi_1| = (1, 0)$ and $\langle\psi_2| = (0, 1)$. For the superposition with
$w_1 = w_2 = 1/\sqrt{2}$, we have

$$|\psi\rangle\langle\psi| = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$

whereas the mixed state

$$p_1|\psi_1\rangle\langle\psi_1| + p_2|\psi_2\rangle\langle\psi_2| = \begin{pmatrix} p_1 & 0 \\ 0 & p_2 \end{pmatrix}$$

is different from the preceding superposition, whatever $p_1$ and $p_2 = 1 - p_1$. Taking
$p_1 = p_2 = \frac{1}{2}$, if we measure the PProM defined by the two projectors
$|\psi_1\rangle\langle\psi_1|$ and $|\psi_2\rangle\langle\psi_2|$ and corresponding outcomes $+1$ and
$-1$, the two states are indistinguishable: each gives probabilities of 1/2 for the two outcomes.
However, if we measure $\langle\psi|\psi\rangle^{-1}|\psi\rangle\langle\psi|$ and
$|\psi\rangle^{\perp}\langle\psi|^{\perp}$, where $|\psi\rangle^{\perp}$ denotes a unit vector in
$\mathcal{H}$ orthogonal to $|\psi\rangle$, then the second state again gives each outcome
probability half, while the first state gives probabilities 1 and 0.
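
The comparison just made can be reproduced in a few lines. This is a hedged sketch in plain Python; the helper `prob` and the variable names are illustrative and not part of the original text. It evaluates $\langle v|\rho|v\rangle = \mathrm{tr}\{\rho\,|v\rangle\langle v|\}$ for the superposition and for the equal mixture, first in the basis $\{\psi_1, \psi_2\}$ and then in the basis $\{\psi, \psi^{\perp}\}$.

```python
import math


def prob(rho, vec):
    """Outcome probability <v|rho|v> for a rank-one projector onto the unit vector v."""
    return sum(complex(vec[r]).conjugate() * rho[r][c] * vec[c]
               for r in range(2) for c in range(2)).real


# Superposition with w1 = w2 = 1/sqrt(2), and the mixture with p1 = p2 = 1/2.
superposition = [[0.5, 0.5], [0.5, 0.5]]
mixture = [[0.5, 0.0], [0.0, 0.5]]

s = 1 / math.sqrt(2)
psi_1, psi_2 = [1.0, 0.0], [0.0, 1.0]   # first measurement basis
psi, psi_perp = [s, s], [s, -s]         # second measurement basis

first_basis = {
    "superposition": (prob(superposition, psi_1), prob(superposition, psi_2)),
    "mixture": (prob(mixture, psi_1), prob(mixture, psi_2)),
}
second_basis = {
    "superposition": (prob(superposition, psi), prob(superposition, psi_perp)),
    "mixture": (prob(mixture, psi), prob(mixture, psi_perp)),
}
```

In the first basis both states give (1/2, 1/2); in the second the superposition gives (1, 0) while the mixture still gives (1/2, 1/2), up to rounding.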

The possibility of taking complex superpositions of state-vectors to get new pure states
corresponds to the wave-particle duality at the heart of quantum mechanics (linear combinations of
solutions to wave equations are also solutions to wave equations). The new states obtained in this
way will have distinctively different properties from the states out of which they are constructed.
On the other hand, taking mixtures of states represents no more and no less than ordinary
probabilistic mixtures: with probability $p_i$ the system has been prepared in state $\rho_i$, for
$i = 1, \ldots, m$. It is a fact that whatever physical predictions one makes about a quantum
system, they will depend on the $|\psi_i\rangle$ and on the $p_i$ or $w_i$ involved in mixed states
or superpositions only through the corresponding matrix $\rho$. Since the representation of $\rho$
as a mixture of pure states and the representation of a pure state as a superposition of other pure
states are highly non-unique, we draw the conclusion that very different ways of preparing a quantum
system, which result in the same state $\rho$, cannot be distinguished from one another by any
measurement whatsoever on the quantum system. This is a most remarkable feature of quantum
mechanics, of absolutely non-classical physical nature.

2.1.2 The Schrodinger Equation 

Typically the state of a particle undergoes an evolution with time under the influence of an external
field. The most basic type of evolution is that of an arbitrary initial state $\rho_0$ under the
influence of a field with Hamiltonian $H$. This takes the form

$$\rho_t = e^{tH/i\hbar}\,\rho_0\,e^{-tH/i\hbar},$$

where $\rho_t$ denotes the state at time $t$, $\hbar$ is Planck's constant (divided by $2\pi$), and
$H$ is a self-adjoint operator on $\mathcal{H}$. If $\rho_0$ is a pure state then $\rho_t$ is pure
for all $t$ and we can choose unit vectors $\psi_t$ such that $\rho_t = |\psi_t\rangle\langle\psi_t|$
and

$$\psi_t = e^{tH/i\hbar}\,\psi_0. \qquad (3)$$

Equation (3) is a solution of the celebrated Schrödinger equation $i\hbar\,(d/dt)\psi = H\psi$ or
equivalently $i\hbar\,(d/dt)\rho = [H, \rho]$.
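
As a worked illustration (not taken from the original text), assume the hypothetical spin-half Hamiltonian $H = \frac{1}{2}\hbar\omega\sigma_z$, where the frequency $\omega$ is introduced here purely for the example. Since $\sigma_z$ is diagonal, the evolution operator in (3) can be written down directly and applied to the state-vector of Example 1:

```latex
e^{tH/i\hbar} = e^{-i\omega t\sigma_z/2}
  = \begin{pmatrix} e^{-i\omega t/2} & 0 \\ 0 & e^{i\omega t/2} \end{pmatrix},
\qquad
\psi_t = e^{tH/i\hbar}\psi_0
  = \begin{pmatrix} e^{-i(\varphi+\omega t)/2}\cos(\vartheta/2) \\
                    e^{i(\varphi+\omega t)/2}\sin(\vartheta/2) \end{pmatrix}.
```

Under this assumption the state at time $t$ is again of the form given in Example 1, with polar coordinates $(\vartheta, \varphi + \omega t)$: the point $\vec{u}$ on the sphere rotates about the $z$-axis at angular velocity $\omega$, and the state remains pure throughout.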

2.1.3 Separability and Entanglement 

When we study several quantum systems (with Hilbert spaces $\mathcal{H}_1, \ldots, \mathcal{H}_m$)
interacting together, the natural model for the combined system has as its Hilbert space the tensor
product $\mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_m$. Then a state such as
$\rho_1 \otimes \cdots \otimes \rho_m$ represents 'particle 1 in state $\rho_1$ and . . . and
particle $m$ in state $\rho_m$'. Suppose the states $\rho_i$ are pure with state-vectors $\psi_i$.
Then the product state we have just defined is also pure with state-vector
$\psi_1 \otimes \cdots \otimes \psi_m$. According to the superposition principle, a complex
superposition of such state-vectors is also a possible state-vector of the interacting systems.
Pure states which cannot be written in the product form $\rho_1 \otimes \cdots \otimes \rho_m$ are
called entangled. The same term is used for mixed states which cannot be written as a mixture of
pure product states. A state which is not entangled is called separable. The existence of entangled
states is responsible for extraordinary quantum phenomena, which scientists are only just starting
to harness (in quantum communication, computation, teleportation, etc.; see Section 8 for an
introduction).

An important physical feature of unitary evolution in a tensor product space is that, in general,
it does not preserve non-entangledness of states. Suppose that the state $\rho_1 \otimes \rho_2$
evolves according to the Schrödinger operator $U_t = e^{tH/i\hbar}$ on
$\mathcal{H}_1 \otimes \mathcal{H}_2$. In general, if $H$ does not have the special form
$H_1 \otimes \mathbf{1}_2 + \mathbf{1}_1 \otimes H_2$, the corresponding state at any non-zero time
is entangled. The notorious Schrödinger Cat, see Section 8.4, is a consequence of this phenomenon of
entanglement. For an illustrative discussion of this see, for instance, Isham (1995, Sect. 8.4.2).



2.1.4 Spin-j 

So far, our concrete examples have had a two-dimensional Hilbert space. Quantum systems in
which the Hilbert space $\mathcal{H}$ is finite-dimensional are sometimes called spin systems. A
spin-$j$ system, where $j$ is a positive half-integer, is one for which the Hilbert space is
$\mathbb{C}^{2j+1}$. A physical interpretation of a spin-$j$ system is in terms of a particle having
spin angular momentum $j$.

An important class of spin-$j$ systems can be obtained from pure spin-half systems as follows.
Let $|\psi\rangle$ be a state vector representing a spin-half particle in a pure state $\rho$. Then
the quantum system consisting of $n$ independent particles, all prepared in this state, is
represented by the state vector $\otimes^n|\psi\rangle$ in $\otimes^n\mathbb{C}^2$. Such state
vectors lie in (and span) the subspace

$$\odot^n\mathbb{C}^2 = \mathrm{span}\{\otimes^n|\psi\rangle : |\psi\rangle \in \mathbb{C}^2\}$$

of $\otimes^n\mathbb{C}^2$. The corresponding states have the form $\otimes^n\rho$ and are sometimes
known as (angular momentum) coherent spin-$j$ states.

Let $\{|\psi_0\rangle, |\psi_1\rangle\}$ be any basis of $\mathbb{C}^2$. Put $j = n/2$ and, for
$m = -j, \ldots, j$, define $|m\rangle$ in $\odot^n\mathbb{C}^2$ by

H = Ij2 (®'IV'o)) ® J , (4) 

\k=0 ) 

where $\Pi_\odot$ denotes the orthogonal projection from $\otimes^n\mathbb{C}^2$ to
$\odot^n\mathbb{C}^2$. The formula



(which can be obtained by binomial expansion) shows that $\{|m\rangle : m = -j, \ldots, j\}$ spans
$\odot^n\mathbb{C}^2$. It is easy to check that this is a basis, and so $\odot^n\mathbb{C}^2$ has
dimension $2j + 1$.

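Example 2 (Coherent Spin-1 states). Take $n = 2$, so that $j = 1$. Then
$\{|\psi_0\rangle \otimes |\psi_0\rangle,\ (|\psi_0\rangle \otimes |\psi_1\rangle + |\psi_1\rangle \otimes |\psi_0\rangle)/\sqrt{2},\ |\psi_1\rangle \otimes |\psi_1\rangle\}$
is a basis of $\odot^2\mathbb{C}^2$. Thus $\odot^2\mathbb{C}^2$ can be identified with
$\mathbb{C}^3$, whereas $\otimes^2\mathbb{C}^2$ can be identified with $\mathbb{C}^4$. The subspace
of $\otimes^2\mathbb{C}^2$ orthogonal to $\odot^2\mathbb{C}^2$ is spanned by
$(|\psi_0\rangle \otimes |\psi_1\rangle - |\psi_1\rangle \otimes |\psi_0\rangle)/\sqrt{2}$. The
corresponding state is known as the singlet or Bell state and helps to demonstrate non-classical
properties of quantum mechanics; see Section 8.2.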

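Spin-1 coherent states can be described in matrix terms as follows. If
$\rho = \frac{1}{2}(\mathbf{1} + u_x\sigma_x + u_y\sigma_y + u_z\sigma_z)$ is a pure state on
$\mathbb{C}^2$ then $u_x^2 + u_y^2 + u_z^2 = 1$ and

$$\rho \odot \rho = \tfrac{1}{4}\{\mathbf{1} + 2(u_xS_x + u_yS_y + u_zS_z) + u_x^2\,\sigma_x \odot \sigma_x + u_y^2\,\sigma_y \odot \sigma_y + u_z^2\,\sigma_z \odot \sigma_z\},$$

where, in terms of the basis $\{|{-1}\rangle, |0\rangle, |1\rangle\}$ of $\odot^2\mathbb{C}^2$,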




and 



(TxGax = \ 1 I <7yQay=\ 1 | a^Qaz 





□ 
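
The first part of Example 2 can be verified by direct computation. The sketch below is illustrative only: it assumes the standard orthonormal basis for $\{|\psi_0\rangle, |\psi_1\rangle\}$, and the helpers `kron` and `inner` and the test vector `psi` are names introduced here. It checks that the three symmetric vectors are orthonormal, and that a product vector $|\psi\rangle \otimes |\psi\rangle$ is orthogonal to the singlet, so that it lies in $\odot^2\mathbb{C}^2$.

```python
import math


def kron(a, b):
    """Tensor (Kronecker) product of two vectors given as lists."""
    return [x * y for x in a for y in b]


def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


# Assumption: psi0 and psi1 are the standard orthonormal basis of C^2.
psi0, psi1 = [1, 0], [0, 1]
s = 1 / math.sqrt(2)

symmetric_basis = [
    kron(psi0, psi0),
    [s * (x + y) for x, y in zip(kron(psi0, psi1), kron(psi1, psi0))],
    kron(psi1, psi1),
]
singlet = [s * (x - y) for x, y in zip(kron(psi0, psi1), kron(psi1, psi0))]

# Gram matrix of the symmetric basis: the identity if the basis is orthonormal.
gram = [[inner(a, b) for b in symmetric_basis] for a in symmetric_basis]

# An arbitrary (illustrative) unit vector and its overlap with the singlet.
psi = [0.6, 0.8j]
overlap = inner(singlet, kron(psi, psi))
```

Here `gram` is the $3 \times 3$ identity up to rounding and `overlap` vanishes, consistent with $\otimes^2\mathbb{C}^2 \cong \mathbb{C}^4$ splitting into the three-dimensional symmetric subspace and the one-dimensional span of the singlet.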

2.2 Measurements



Operator-valued probability measures, or OProM's, were introduced in the beginning of the present
section. We shall denote by OProM$(\mathcal{X}, \mathcal{H})$ the set of OProM's on $\mathcal{X}$.

As indicated earlier, a basic kind of operator-valued probability measures consists of those in
which the operators $M(A)$ are orthogonal projections. Specifically, a projector-valued probability
measure (or PProM, also called a simple measurement) is an operator-valued probability measure
$M$ such that

$$M(A) = M(A)^* = M(A)^2, \qquad A \in \mathcal{A}.$$

We shall denote by PProM$(\mathcal{X}, \mathcal{H})$ the set of PProM's on $\mathcal{X}$. As we
noted, when the outcome space $\mathcal{X}$ is $\mathbb{R}$, the PProM's stand in one-to-one
correspondence with the self-adjoint operators on $\mathcal{H}$, which in this context are also
called observables. If one measures the observable $X$ on a quantum system in state $\rho$, it turns
out that the expected value of the outcome is given by the trace rule

$$E(\mathrm{meas}(X;\rho)) = \sum_x x\,\mathrm{tr}\{\rho\,\Pi_{[x]}\} = \mathrm{tr}\Bigl\{\rho \sum_x x\,\Pi_{[x]}\Bigr\} = \mathrm{tr}\{\rho X\}. \qquad (5)$$

Example 3 (Spin-half, cont.). For any unit vector $\psi$ of $\mathbb{C}^2$, the observable
$2|\psi\rangle\langle\psi| - \mathbf{1} = |\psi\rangle\langle\psi| - |\psi^{\perp}\rangle\langle\psi^{\perp}|$
defines a PProM. It has eigenvalues 1 and $-1$ and one-dimensional eigenspaces spanned by $\psi$ and
$\psi^{\perp}$. This operator measures the spin of the particle in the direction (on the Poincaré
sphere) defined by $\psi$. We mentioned two of such measurements in Section 2.1.1 on mixing and
superposition. □

Example 4 (Spin-half, cont.). In particular, with $\mathcal{X} = \{-1, 1\}$, the specification

$$M(\{+1\}) = \tfrac{1}{2}(\mathbf{1} + \sigma_x), \qquad M(\{-1\}) = \tfrac{1}{2}(\mathbf{1} - \sigma_x)$$

defines an element of PProM$(\mathcal{X}, \mathbb{C}^2)$. It corresponds to the observable
$\sigma_x$: spin in the $x$-direction. □
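
The trace rule (5) can be illustrated with the measurement of Example 4. The following sketch makes some assumptions for the sake of a concrete number: the state is taken to be $\rho = \frac{1}{2}(\mathbf{1} + \vec{u}\cdot\vec{\sigma})$ with the hypothetical choice $\vec{u} = (0.6, 0, 0.8)$, and the helper names are illustrative. It computes the two outcome probabilities $\mathrm{tr}\{\rho M(\{\pm 1\})\}$ and compares the resulting mean with $\mathrm{tr}\{\rho\sigma_x\}$.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def trace(a):
    """Trace of a 2 x 2 matrix."""
    return a[0][0] + a[1][1]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]


def spin_x_projector(sign):
    """M({sign}) = (1 + sign * sigma_x) / 2, as in Example 4."""
    return [[0.5 * (IDENTITY[r][c] + sign * SIGMA_X[r][c]) for c in range(2)]
            for r in range(2)]


# Hypothetical pure state with Bloch vector u = (u_x, 0, u_z) on the unit sphere.
u_x, u_z = 0.6, 0.8
rho = [[0.5 * (1 + u_z), 0.5 * u_x],
       [0.5 * u_x, 0.5 * (1 - u_z)]]

# Law of the outcome: Pr(x = +1) and Pr(x = -1).
probs = {sign: trace(matmul(rho, spin_x_projector(sign))) for sign in (+1, -1)}

# Mean of the outcome, computed from the law and from the trace rule.
mean_outcome = sum(sign * p for sign, p in probs.items())
trace_rule = trace(matmul(rho, SIGMA_X))
```

With this choice the probabilities are $(1 \pm u_x)/2$, and both `mean_outcome` and `trace_rule` equal $u_x$ up to rounding.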

We next discuss the notion of quantum randomisation whereby adding an auxiliary quantum 
system to a system under study gives one further possibilities for probing the system of interest. 
This also connects to the important notion of realisation: representing generalised measurements 
by simple measurements on a quantum randomised extension. 

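Suppose given a Hilbert space $\mathcal{H}$, and a pair $(\mathcal{K}, \rho_a)$, where $\mathcal{K}$
is a Hilbert space and $\rho_a$ is a state on $\mathcal{K}$. Any measurement $\widetilde{M}$ in
OProM$(\mathcal{X}, \mathcal{H} \otimes \mathcal{K})$ induces a measurement $M$ in
OProM$(\mathcal{X}, \mathcal{H})$ which is determined by

$$\mathrm{tr}\{\rho M(A)\} = \mathrm{tr}\{(\rho \otimes \rho_a)\widetilde{M}(A)\}, \qquad \rho \in \mathbb{S}(\mathcal{H}),\ A \in \mathcal{A}. \qquad (6)$$

The pair $(\mathcal{K}, \rho_a)$ is called an ancilla. The following theorem (Holevo's extension of
Naimark's Theorem, see Appendix A.2) states that any measurement $M$ in
OProM$(\mathcal{X}, \mathcal{H})$ is of the form (6) for some ancilla $(\mathcal{K}, \rho_a)$ and
some simple measurement $\widetilde{M}$ in PProM$(\mathcal{X}, \mathcal{H} \otimes \mathcal{K})$. The
triple $(\mathcal{K}, \rho_a, \widetilde{M})$ is called a realisation of $M$ (the words extension or
dilation are also used sometimes). Adding an ancilla before taking a simple measurement could be
thought of as quantum randomisation.

Theorem 1 (Holevo, 1982). For every $M$ in OProM$(\mathcal{X}, \mathcal{H})$, there is an ancilla
$(\mathcal{K}, \rho_a)$ and an element $\widetilde{M}$ of
PProM$(\mathcal{X}, \mathcal{H} \otimes \mathcal{K})$ which form a realisation of $M$.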

We use the term quantum randomisation, because of its analogy with the mathematical rep- 
resentation of randomisation in classical statistics, whereby one replaces the original probability 
space with a product space, one of whose components is the original space of interest, while the 
other corresponds to an independent random experiment with probabilities under the control of
the experimenter. Just as randomisation in classical statistics is sometimes needed to solve op-
timisation problems of statistical decision theory, quantum randomisation sometimes allows for 
strictly better solutions than can be obtained without it. 

Here is a simple spin-half example of an OProM which cannot be represented without quantum 
randomisation. 

Example 5 (The triad). The triad, or Mercedes-Benz logo, has an outcome space consisting
of just three outcomes: let us call them 1, 2 and 3. Let $\vec{v}_i$, $i = 1, 2, 3$, denote three
unit vectors in the same plane through the origin in $\mathbb{R}^3$, at angles of 120° to one
another. Then the matrices $M(\{i\}) = \frac{1}{3}(\mathbf{1} + \vec{v}_i \cdot \vec{\sigma})$ define
an OProM on the sample space $\{1, 2, 3\}$. It turns up as the optimal solution to the decision
problem: suppose a spin-half system is generated in one of the three states
$\rho_i = \frac{1}{2}(\mathbf{1} - \vec{v}_i \cdot \vec{\sigma})$, $i = 1, 2, 3$, with equal
probabilities. What decision rule gives the maximum probability of guessing the actual state
correctly? There is no way to equal the success probability of this method, if one uses only simple
measurements, even allowing for (classically) randomised procedures. □
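
That the triad is a genuine OProM, but not a PProM, can be confirmed numerically. The sketch below is illustrative rather than part of the original argument: it assumes the three vectors lie in the $x$-$z$ plane (any plane through the origin would do), and the helper names are introduced here. It checks that the three matrices sum to the identity, that each is non-negative (zero determinant, positive trace), and that each squares to $\frac{2}{3}$ of itself rather than to itself.

```python
import math


def triad_element(angle):
    """(1/3)(1 + v . sigma) for the unit vector v = (sin(angle), 0, cos(angle))."""
    vx, vz = math.sin(angle), math.cos(angle)
    return [[(1 + vz) / 3, vx / 3],
            [vx / 3, (1 - vz) / 3]]


def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


# Assumption: the triad lies in the x-z plane, at angles 0, 120 and 240 degrees.
triad = {i: triad_element(2 * math.pi * (i - 1) / 3) for i in (1, 2, 3)}

# Normalisation: the elements should sum to the identity.
total = [[sum(triad[i][r][c] for i in triad) for c in range(2)] for r in range(2)]

# Non-negativity of a 2 x 2 self-adjoint matrix: trace >= 0 and determinant >= 0.
traces = {i: m[0][0] + m[1][1] for i, m in triad.items()}
determinants = {i: m[0][0] * m[1][1] - m[0][1] * m[1][0] for i, m in triad.items()}

# A projector would satisfy M^2 = M; here M^2 = (2/3) M instead.
squares = {i: matmul(m, m) for i, m in triad.items()}
```

Each element has eigenvalues 0 and 2/3, so it is a non-negative multiple of a rank-one projector without being a projector itself.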

Finally, we introduce some further terminology concerning measurements. Given an OProM $M$ and a measurable function $T$ from its outcome space $\mathcal X$ to another space $\mathcal Y$, one can define a new measurement $M' = M\circ T^{-1}$ with outcome space $\mathcal Y$. It corresponds to restricting attention to the function $T$ of the outcome of the first measurement $M$. We call it a coarsening of the original measurement, and conversely we say that $M$ is a refinement of $M'$.

A measurement $M$ is called dominated by a (real, sigma-finite) measure $\nu$ on the outcome space, if there exists a non-negative self-adjoint matrix-valued function $m(x)$, called the density of $M$, such that $M(B) = \int_B m(x)\,\nu(dx)$ for all $B$. In the finite-dimensional case every measurement is dominated: take $\nu(B) = \operatorname{trace}(M(B))$.

To exemplify these notions, suppose for some dominated measurement $M$ one can write $m(x) = m_1(x) + m_2(x)$ for two non-negative self-adjoint matrix-valued functions $m_1$ and $m_2$. Then one can define a refinement $M'$ of $M$ as the measurement on the outcome space $\mathcal X' = \mathcal X\times\{1,2\}$ with density $m_i(x)$, $(x,i)\in\mathcal X'$, with respect to the product of $\nu$ with counting measure.

We described earlier how one can form product spaces from separate quantum systems, leading to notions of product states, separable states, and entangled states. Given an OProM $M$ on one component of a product space, one can naturally talk about 'the same measurement' on the product system. It has components $M(B)\otimes\mathbf 1$. Given measurements $M$ and $M'$ defined on the two components of a product system, one can define in a natural way the measurement 'apply simultaneously $M$ and $M'$ to each component': its outcome space is the product of the two outcome spaces, and it is defined using obvious notation by $(M\otimes M')(B\times B') = M(B)\otimes M'(B')$.

A measurement $M$ on a product space is called separable if it has a density $m$ such that each $m(x)$ can be written as a positive linear combination of tensor products of non-negative components. It can then be thought of as a coarsening of a measurement with density $m'$ such that each $m'(y)$ is a product of non-negative components.



2.3 Instruments 

When a physical measurement is made on a quantum system, the system usually changes state in some stochastic manner. Thus a complete description of the measurement specifies not just the probability distribution of the outcome $x$ but also the new state of the system when the outcome is $x$. We shall refer to the states of the system before and after measurement as the prior state and the posterior state, and use the notation $\mathcal N$ to denote a particular mapping from prior states to probability distributions over outcomes, with a particular posterior state associated with each outcome and given prior state. Such mappings are called instruments (Davies and Lewis, 1970; Davies, 1976). Because of the basic rules of quantum mechanics, an instrument cannot be completely arbitrary but must satisfy certain constraints. We shall describe these constraints after we have introduced some further notation.

The word 'instrument' is not very illuminating. The concept which we are trying to catch here 
is that of any interaction between a quantum system and the real world. The interaction will 
change the state of the quantum system, and cause changes in the real world. One can think of
these changes as being information recorded in classical physical systems. Data stored on a CD- 
ROM or printed on paper is just one kind of classical physical information. A measurement, in the 
sense of a deliberately carried out experiment, is just one kind of interaction. The data which are 
available to an experimenter, after a measurement has been done, form only part of the totality of 
changes which have happened in the real world. So one can distinguish between what is somehow 
imprinted in the real world as a result of the interaction which takes place when the instrument 
is applied to the quantum system, and a coarsened or reduced version of this information, which 
is the outcome of the measurement as it is available to the experimenter. What is relevant for 
the experimenter is the final state of the quantum system, conditioned on the data which he has 
available. This is typically different from the final state of the quantum system, conditioned on 
the final state of the real world. 

In the following, the outcome of the instrument will refer to the data available to the exper- 
imenter, and the posterior state means the final state (possibly mixed) of the quantum system 
given this information only. 

Consider an instrument $\mathcal N$ with outcomes $x$ in the measurable space $(\mathcal X,\mathcal A)$. Let $\pi(dx;\rho,\mathcal N)$ denote the probability distribution of the outcome of the measurement, and let $\sigma(x;\rho,\mathcal N)$ denote the posterior state when the prior state is $\rho$ and the outcome of the measurement is $x$. Now let $Y$ denote some observable on the quantum system and let $A\in\mathcal A$ denote a measurable set of outcomes. Suppose one 'measures the instrument' on the state $\rho$, registers whether or not the outcome is in $A$, and subsequently measures the observable $Y$. Then the expected value of the indicator of the event 'outcome is in $A$' times the outcome of measuring the observable $Y$ is the number $\int_A \pi(dx;\rho,\mathcal N)\,\mathrm{tr}\{\sigma(x;\rho,\mathcal N)Y\}$, by using the trace rule. Now it turns out that this number, seen as a function of prior state $\rho$, measurable subset of outcomes $A$, and observable $Y$, determines $\mathcal N$ completely. By the interpretation of mixed states as probability mixtures, it follows that the expression is linear in $\rho$ and therefore can be rewritten as $\mathrm{tr}\{\rho\,\mathcal N(A)[Y]\}$ where $\mathcal N(A)[Y]$, for each event $A$ in the outcome space and each observable $Y$, is a uniquely defined (possibly unbounded) self-adjoint operator on $\mathcal H$. This linearity constraint restricts considerably the class of all possible $(\pi,\sigma)$. One can show that $\mathcal N(A)[Y]$ must be countably additive in the argument $A$, linear and positive in $Y$ (positive in the sense of mapping nonnegative operators to nonnegative operators), and normalised in the sense that $\mathcal N(\mathcal X)[\mathbf 1] = \mathbf 1$.

Thus, mathematically, an instrument $\mathcal N$ can be specified equally well by giving the probability distribution of the outcome of the measurement $\pi(dx;\rho,\mathcal N)$, together with the posterior state $\sigma(x;\rho,\mathcal N)$, as by giving an operator $\mathcal N(A)[Y]$ for each $A$ and $Y$. The physical constraints imposed by quantum theory restrict the possible $(\pi,\sigma)$, and equivalently restrict the possible $\mathcal N(A)[Y]$. The second specification is less direct but more convenient from a theoretical point of view, since the physical constraints (additivity, linearity, positivity, normalisation) are much simpler to express in those terms. In a moment we indicate that, on further physical considerations, the positivity condition should be strengthened to a condition called complete positivity.

Following Ozawa (1985), we show how to recover $(\pi,\sigma)$ from $\mathcal N$. The first step is to read off the measurement or OProM $M$ which is determined by the instrument $\mathcal N$, when we ignore the posterior state. This is given by the prescription

$$M(A) = \mathcal N(A)[\mathbf 1]. \tag{7}$$

The probability that the measurement of the state $\rho$ results in an outcome in $A$ is given by

$$\pi(A;\rho,\mathcal N) = \mathrm{tr}\{\rho\,\mathcal N(A)[\mathbf 1]\}.$$

If the system was in state $\rho$ just before the measurement then the state of the system after the measurement, given that the measurement was observed to result in an outcome belonging to $A$, is determined as the solution $\sigma(A;\rho,\mathcal N)$ of the equation

$$\mathrm{tr}\{\sigma(A;\rho,\mathcal N)Y\} = \frac{\mathrm{tr}\{\rho\,\mathcal N(A)[Y]\}}{\mathrm{tr}\{\rho\,\mathcal N(A)[\mathbf 1]\}}, \qquad Y\in\mathbb B(\mathcal H)$$

(provided that $\mathrm{tr}\{\rho\,\mathcal N(A)[\mathbf 1]\} > 0$). Finally, the family $\sigma(x;\rho,\mathcal N)$ of posterior states is characterised (almost everywhere, with respect to $\pi$) by

$$\mathrm{tr}\{\rho\,\mathcal N(A)[Y]\} = \int_A \mathrm{tr}\{\sigma(x;\rho,\mathcal N)Y\}\,\pi(dx;\rho,\mathcal N), \qquad Y\in\mathbb B(\mathcal H),\ A\in\mathcal A.$$

An extremely important class of quantum instruments consists of those of the form

$$\mathcal N(dx)[Y] = \sum_i W_i(x)^*\,Y\,W_i(x)\,\nu(dx), \tag{8}$$

where $\nu$ is a $\sigma$-finite measure on $\mathcal X$ (and, without loss of generality, can be taken to be a probability measure), the index $i$ runs over some finite or countable set, and $W_i$ is a measurable function from $\mathcal X$ to $\mathbb B(\mathcal H)$ such that

$$\sum_i \int_{\mathcal X} W_i(x)W_i(x)^*\,\nu(dx) = \mathbf 1.$$

For such quantum instruments, the posterior states are

$$\sigma(x;\rho,\mathcal N) = \frac{\sum_i W_i(x)^*\,\rho\,W_i(x)}{\sum_i \mathrm{tr}\{\rho\,W_i(x)W_i(x)^*\}}$$

and the distribution of the outcome is

$$\pi(dx;\rho,\mathcal N) = \sum_i \mathrm{tr}\{\rho\,W_i(x)W_i(x)^*\}\,\nu(dx).$$

Such quantum instruments are almost generic, in the sense that an instrument which satisfies the further physically motivated condition of complete positivity can be represented as in (8) except that the operators $W_i(x)$ need not be bounded (in which case the formulae we have given need to be interpreted with some care).
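The formulae for the outcome distribution and the posterior states can be tried out on a small discrete example. The sketch below assumes a two-outcome instrument on a qubit with counting measure and one operator $W(x)$ per outcome; the operators, the value `G`, the outcome labels and the function names are hypothetical choices for illustration, picked so that the normalisation $\sum_x W(x)W(x)^* = \mathbf 1$ holds.

```python
import math

def dagger(a):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

def outcome_probability(rho, ws):
    """sum_i tr{rho W_i W_i^*} for the operators W_i attached to one outcome."""
    return sum(trace(matmul(rho, matmul(w, dagger(w)))) for w in ws).real

def posterior_state(rho, ws):
    """sum_i W_i^* rho W_i, divided by the probability of the outcome."""
    p = outcome_probability(rho, ws)
    n = len(rho)
    terms = [matmul(dagger(w), matmul(rho, w)) for w in ws]
    return [[sum(t[r][c] for t in terms) / p for c in range(n)] for r in range(n)]

G = 0.3  # hypothetical strength parameter of the illustrative instrument
INSTRUMENT = {
    "no_click": [[[1.0, 0.0], [0.0, math.sqrt(1 - G)]]],
    "click": [[[0.0, 0.0], [math.sqrt(G), 0.0]]],
}
RHO = [[0.5, 0.5], [0.5, 0.5]]  # prior state: an equal superposition

probabilities = {x: outcome_probability(RHO, ws) for x, ws in INSTRUMENT.items()}
posteriors = {x: posterior_state(RHO, ws) for x, ws in INSTRUMENT.items()}
```

In this example the outcome probabilities sum to one, and the posterior state after the outcome `"click"` is a fixed pure state whatever the prior.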

The mathematical definition of complete positivity is given in Appendix A.1. Its intuitive meaning is as follows. We can consider the instrument as acting not just on the system of interest $\mathcal H$ but also on a completely arbitrary system $\mathcal K$ somewhere else in the universe. If the systems are independent, we can express the joint state as a tensor product, and the instrument acts on it by transforming the system of interest as we have already specified, while leaving the auxiliary system unchanged; the posterior joint state remains a product state. Now once we have specified how the extended instrument acts on product states, one can calculate how it acts on any joint state, including entangled states, by using the linearity which is a basic feature of quantum physics. To be physically meaningful, this extended instrument has to be positive, in the sense of mapping states (nonnegative matrices) to states (after all, the system we are studying may actually be in an entangled state with a system elsewhere). The mathematical statement of this physical property is called complete positivity.

Formulae like (8) are known in the physics literature as Kraus representations. If we allow unbounded instruments for which the self-adjoint operator $\mathcal N(A)[Y]$ is not necessarily bounded for all $A$ and $Y$, then the $W_i(x)$ need not be bounded either. In this case posterior states may not be defined for each outcome of the measurement, but only for each measurable collection of outcomes of positive probability. Allowing unbounded operators as well as bounded makes a difference only in infinite-dimensional spaces, see Example 15 in Appendix A.1. Key references on instruments and complete positivity are Stinespring (1955), Davies and Lewis (1970), Davies (1976), Kraus (1983), Ozawa (1985), Loubenets (1990, 2000), and Holevo (2001a).

Example 6 (Simple Instruments). Let $\{\Pi_{[x]} : x\in\mathcal X\}$ define a PProM on a finite-dimensional quantum system, corresponding to the simple measurement of the observable $Q = \sum_x x\,\Pi_{[x]}$. One can embed this measurement in many different instruments, i.e., the state could be transformed by the measurement in many different ways. However the simplest description possible is obtained when one takes, in (8), $\nu$ to be counting measure on the finite set $\mathcal X$, the set of indices $i$ to contain a single element, and $W_i(x) = W(x) = \Pi_{[x]}$. We call this particular instrument the corresponding simple instrument. If one applies it to a system in the pure state with state vector $\psi$, and observes the outcome $x$, then the state of the system remains pure but now has state vector $\Pi_{[x]}\psi/\|\Pi_{[x]}\psi\|$. The probability of this event is precisely $\|\Pi_{[x]}\psi\|^2$. When the state transforms in this way, one says that von Neumann's or Lüders' projection postulate holds for the measurement of the observable $Q$.

Two observables $Q$, $P$ are called compatible if as operators they commute. For a Borel measurable function $f:\mathbb R\to\mathbb R$ and an observable $R$ with eigenvalues $r$ and eigenspaces the ranges of the projectors $\Pi_{[R=r]}$, the observable $f(R)$ is the operator $\sum_r f(r)\,\Pi_{[R=r]}$. A celebrated result of von Neumann is that observables $Q$ and $P$ are compatible if and only if they are both functions $f(R)$, $g(R)$ of a third observable $R$. Taking $R$ to have as coarse a collection of eigenspaces as possible, one can show that the results of the following three instruments are identical: the simple instrument for $Q$ followed by the simple instrument for $P$, recording the values $q$ of $Q$ and $p$ of $P$; the simple instrument for $P$ followed by the simple instrument for $Q$, recording the values $q$ of $Q$ and $p$ of $P$; and the simple instrument for $R$, recording the values $q = f(r)$ and $p = g(r)$ where $r$ is the observed value of $R$.

It follows that the probability distribution of the outcome of measurement of an observable $P$ is not altered when it is measured (simply, jointly) together with any other compatible observables. Note that the expected value of the outcome of a measurement of the observable $Q$ on a quantum system in state $\rho$ is $\mathrm{tr}\{\rho Q\}$, and the expected value of the real function $f$ of this outcome is $\mathrm{tr}\{\rho f(Q)\}$, identical to the expectation of the outcome of a measurement of the observable $f(Q)$. We call this rule the law of the unconscious quantum physicist since it is analogous to the law of the unconscious statistician, according to which the expectation of a function $Y = f(X)$ of a random variable $X$ may be calculated by an integration (i) over the underlying probability space, (ii) over the outcome space of $X$, (iii) over the outcome space of $Y$.

A useful consequence of this calculus of functions of observables is that the characteristic function of the distribution of a measurement of an observable $Q$ is equal to $\mathrm{tr}\{\rho\,e^{itQ}\}$. Since $Q$ is self-adjoint, $e^{itQ}$ is unitary and the trace may have a physical interpretation which aids its calculation. □
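The projection postulate and the rule for expectations of functions of an outcome can be illustrated on a single qubit. The sketch below assumes the observable $Q = \sigma_x$, whose eigenvalues $\pm1$ have projectors $\frac12(\mathbf 1 \pm \sigma_x)$, and a real pure state $\psi = (\cos 0.3, \sin 0.3)$; the state, the argument $t = 0.7$ of the characteristic function and all function names are hypothetical choices.

```python
import cmath
import math

# Spectral decomposition of the observable Q = sigma_x: eigenvalue -> projector.
PROJECTORS = {
    +1: [[0.5, 0.5], [0.5, 0.5]],
    -1: [[0.5, -0.5], [-0.5, 0.5]],
}

def apply(matrix, vec):
    return [sum(matrix[r][c] * vec[c] for c in range(2)) for r in range(2)]

def norm(vec):
    return math.sqrt(sum(abs(z) ** 2 for z in vec))

def simple_instrument(psi):
    """Map each outcome x to (probability, posterior state vector)."""
    result = {}
    for x, proj in PROJECTORS.items():
        projected = apply(proj, psi)
        length = norm(projected)
        if length > 0:  # outcomes of probability zero have no posterior state
            result[x] = (length ** 2, [z / length for z in projected])
    return result

def expectation_of_function(psi, f):
    """Expected value of f(outcome), i.e. tr{rho f(Q)} with f(Q) = sum_x f(x) Pi_x."""
    return sum(p * f(x) for x, (p, _) in simple_instrument(psi).items())

psi = [math.cos(0.3), math.sin(0.3)]
table = simple_instrument(psi)
mean = expectation_of_function(psi, lambda x: x)
char_fn = expectation_of_function(psi, lambda x: cmath.exp(1j * 0.7 * x))
```

For this state the mean equals $\langle\psi|\sigma_x|\psi\rangle = \sin 0.6$, and the characteristic function at $t$ is $\cos t + i\,\sin t\,\sin 0.6$, as obtained directly from $\mathrm{tr}\{\rho\,e^{it\sigma_x}\}$.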

Further results of Ozawa (1985) generalise the realisability of measurements (Naimark, Holevo theorems) to the realisability of an arbitrary completely positive instrument. Namely, after forming a compound system by taking the tensor product with some ancilla, the instrument can be realised as a unitary (Schrödinger) evolution for some length of time, followed by the action of a simple instrument (measurement of an observable, with state transition according to von Neumann's projection postulate). Therefore to say that the most general operation on a quantum system is a completely positive instrument comes down to saying: the only mechanisms known in quantum mechanics are Schrödinger evolution, von Neumann measurement, and forming compound systems (entanglement). Combining these ingredients in arbitrary ways, one remains within the class of completely positive instruments; moreover, anything in that class can be realised in this way.

Just as we introduced notions of coarsening and refinement for OProM's, and discussed OProM's 
on product systems, one can do the same (and more) for instruments. The extra ingredient is 
composition. Since the description of an instrument includes the state of the system after the 
measurement by the instrument, we are able to define mathematically the composition of two 
instruments, corresponding to the notion of applying first one instrument, and then the second,
while registering the outcomes (data) produced at each of the two stages. The outcome space 
of the composition of two instruments is the product of the two respective outcome spaces. A 
more complicated form of composition is possible, in which the second instrument is replaced by a 
family of instruments, indexed by possible outcomes of the first instrument. Informally: apply the 
first instrument, then choose a second instrument depending on the outcome of the first; keep the 
outcomes of both. We do not write out the mathematical formalism for describing these rather 
natural concepts. 

For coarsening, we do write out some formal details, since we need later to refer to a specific result. Let $\mathcal N$ denote an instrument on a Hilbert space $\mathcal H$ and with outcome space $(\mathcal X,\mathcal A)$ and let $\mathcal N'$ be a coarsening of $\mathcal N$, i.e. $\mathcal N'$ is an instrument on the same Hilbert space $\mathcal H$, with outcome space $(\mathcal Y,\mathcal B)$, and there is a mapping $T$ from $(\mathcal X,\mathcal A)$ to $(\mathcal Y,\mathcal B)$ such that

$$\mathcal N'(B)[\,\cdot\,] = \mathcal N(T^{-1}(B))[\,\cdot\,]$$

for all $B\in\mathcal B$. This mathematical formalism defines the instrument corresponding to applying the instrument $\mathcal N$, registering the result of applying the function $T$ to the outcome $x$, and discarding $x$. Because of this interpretation, one has the following relation between the posterior states $\sigma(x;\rho,\mathcal N)$ and $\sigma(t;\rho,\mathcal N')$:

$$\sigma(t;\rho,\mathcal N') = \int_{T^{-1}(t)} \sigma(x;\rho,\mathcal N)\,\pi(dx\mid t;\rho,\mathcal N), \tag{9}$$

where $\pi(dx\mid t;\rho,\mathcal N)$ is the conditional distribution of $x$ given $T(x) = t$ computed from $\pi(dx;\rho,\mathcal N)$.
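Relation (9) can be checked numerically in a discrete setting, where the integral becomes a finite mixture. The sketch below assumes a three-level system, the simple instrument for the standard basis (outcomes 0, 1, 2) and a coarsening map `T` that merges outcomes 0 and 1; the prior state `RHO`, the labels `"low"` and `"high"` and the function names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

def projector(x, n=3):
    """Projector onto the x-th standard basis vector."""
    return [[1.0 if r == c == x else 0.0 for c in range(n)] for r in range(n)]

# Hypothetical prior state (symmetric, trace one, positive definite).
RHO = [[0.5, 0.2, 0.1], [0.2, 0.3, 0.0], [0.1, 0.0, 0.2]]
# Coarsening map: fine outcomes 0 and 1 are reported as "low", outcome 2 as "high".
T = {0: "low", 1: "low", 2: "high"}

def unnormalised_posterior(x):
    p = projector(x)
    return matmul(p, matmul(RHO, p))

def fine_probability(x):
    return trace(unnormalised_posterior(x))

def fine_posterior(x):
    u = unnormalised_posterior(x)
    return [[v / trace(u) for v in row] for row in u]

def coarse_posterior_direct(t):
    """Posterior of the coarsened instrument, built from N'({t}) = N(T^{-1}(t))."""
    fibre = [x for x in T if T[x] == t]
    total = [[sum(unnormalised_posterior(x)[r][c] for x in fibre) for c in range(3)]
             for r in range(3)]
    return [[v / trace(total) for v in row] for row in total]

def coarse_posterior_mixture(t):
    """Right-hand side of (9): mixture of fine posteriors with weights pi(x | t)."""
    fibre = [x for x in T if T[x] == t]
    p_t = sum(fine_probability(x) for x in fibre)
    return [[sum(fine_probability(x) / p_t * fine_posterior(x)[r][c] for x in fibre)
             for c in range(3)] for r in range(3)]
```

Both constructions give the same posterior state for each coarse outcome; for `"low"` it is the mixed state with diagonal entries 0.625 and 0.375.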

An instrument defined on one component of a product system can be extended in a natural way (similar to that described in Section 2.2 for measurements) to an instrument on the product system. Conversely, it is of great interest whether instruments on a product system can in some way be reduced to 'separate instruments on the separate sub-systems'. There are two important notions in this context. The first (similar to the concept of separability of measurements) is the mathematical concept of separability of an instrument defined on a product system: this is that each $W_i(x)$ in some representation is a tensor product of separate matrices for each component. The second is the physical property which we shall call multilocality: an instrument is called multilocal, if it can be represented as a coarsening of a composition of separate instruments applied sequentially to separate components of the product system, where the choice of each instrument at each stage may depend on the outcomes of the instruments applied previously. Moreover, each component of the system may be measured several times (i.e., at different stages), and the choice of component measured at the $n$th stage may depend on the outcomes at previous stages. One should think of the different components of the quantum system as being localised at different locations in space. At each location separately, anything quantum is allowed, but all communication between locations is classical. It is a theorem of Bennett et al. (1999a) that every multilocal instrument is separable, but that (surprisingly) not all separable instruments are multilocal. It is an open problem to find a physically meaningful characterisation of separability, and conversely to find a mathematically convenient characterisation of multilocality. (Note, our terminology is not standard: the word 'unentangled' is used by some authors instead of separable, and 'separable' instead of multilocal.)

Not all joint measurements (by which we just mean instruments on product systems) are separable, let alone multilocal. Just as quantum randomised measurements can give strictly more powerful ways to probe the state of a quantum system than (combinations of) simple measurements and classical randomisation, so non-separable measurements can do strictly better than separable measurements at extracting information from product systems, even if a priori there is no interaction of any kind between the subsystems; this is a main conclusion of Section 6.3.



3 Parametric Quantum Models and Likelihood 

A measurement from a parametric quantum model $(\rho, m)$ results in an observation $x$ with density

$$p(x;\theta) = \mathrm{tr}\{\rho(\theta)m(x)\}$$

and log likelihood

$$l(\theta) = \log\mathrm{tr}\{\rho(\theta)m(x)\}.$$

For simplicity, let us suppose $\theta$ is one-dimensional. For the calculation of log likelihood derivatives in the present setting it is convenient to work with the symmetric logarithmic derivative or quantum score of $\rho$, denoted by $\rho_{/\!/\theta}$. This is defined implicitly as the self-adjoint solution of the equation

$$\rho_{/\theta} = \rho\circ\rho_{/\!/\theta}, \tag{10}$$

where $\circ$ denotes the Jordan product, i.e.

$$\rho\circ\rho_{/\!/\theta} = \tfrac12\left(\rho\,\rho_{/\!/\theta} + \rho_{/\!/\theta}\,\rho\right),$$

$\rho_{/\theta}$ denoting the ordinary derivative of $\rho$ with respect to $\theta$ (term by term differentiation in matrix representations of $\rho$). (We shall often suppress the argument $\theta$ in quantities like $\rho$, $\rho_{/\theta}$, etc.) The quantum score exists and is essentially unique subject only to mild conditions (for a discussion of this see, for example, Holevo, 1982).

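The likelihood score $l_{/\theta}(\theta) = (d/d\theta)\,l(\theta)$ may be expressed in terms of the quantum score $\rho_{/\!/\theta}(\theta)$ of $\rho(\theta)$ as

$$\begin{aligned}
l_{/\theta}(\theta) &= p(x;\theta)^{-1}\,\mathrm{tr}\{\rho_{/\theta}(\theta)m(x)\}\\
&= p(x;\theta)^{-1}\,\tfrac12\,\mathrm{tr}\{(\rho(\theta)\rho_{/\!/\theta}(\theta) + \rho_{/\!/\theta}(\theta)\rho(\theta))\,m(x)\}\\
&= p(x;\theta)^{-1}\,\Re\,\mathrm{tr}\{\rho(\theta)\rho_{/\!/\theta}(\theta)m(x)\},
\end{aligned}$$

where we have used the fact that for any self-adjoint operators $P, Q, R$ on $\mathcal H$ the trace operation satisfies $\mathrm{tr}\{PQR\} = \overline{\mathrm{tr}\{RQP\}}$ and, for any operator $Q$, $\Re\,\mathrm{tr}\{Q\} = \frac12\mathrm{tr}\{Q + Q^*\}$. It follows that

$$E_\theta[l_{/\theta}(\theta)] = \mathrm{tr}\{\rho(\theta)\rho_{/\!/\theta}(\theta)\}.$$

Thus, since the mean value of $l_{/\theta}$ is 0, we find that

$$\mathrm{tr}\{\rho(\theta)\rho_{/\!/\theta}(\theta)\} = 0. \tag{11}$$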

The expected (Fisher) information $i(\theta) = i(\theta;M) = E_\theta[l_{/\theta}(\theta)^2]$ may be written as

$$i(\theta;M) = \int p(x;\theta)^{-1}\left(\Re\,\mathrm{tr}\{\rho(\theta)\rho_{/\!/\theta}(\theta)m(x)\}\right)^2\nu(dx). \tag{12}$$

It plays a key role in the quantum context, just as in classical statistics, and is discussed in Section 6. In particular, we will there discuss its relation with the expected or Fisher quantum information

$$I(\theta) = \mathrm{tr}\{\rho(\theta)\rho_{/\!/\theta}(\theta)^2\}. \tag{13}$$

The quantum score is a self-adjoint operator, and therefore may be interpreted as an observable which one might measure on the quantum system. What we have just seen is that the outcome of a simple measurement of the quantum score has mean zero, and variance equal to the quantum Fisher information.
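To make the quantum score concrete, the sketch below works through a one-parameter family of mixed spin-half states, $\rho(\theta) = \frac12(\mathbf 1 + R\cos\theta\,\sigma_x + R\sin\theta\,\sigma_y)$ with $0 < R < 1$. This family, the value `R = 0.8`, the candidate score $R(-\sin\theta\,\sigma_x + \cos\theta\,\sigma_y)$ and the function names are assumptions chosen for illustration; the code checks the defining equation (10), the mean-zero property (11), and compares the Fisher information of a simple measurement of $\sigma_x$ with the quantum information (13).

```python
import math

SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
ID = [[1, 0], [0, 1]]
R = 0.8  # hypothetical Bloch-vector length; R < 1 gives mixed states

def combo(c0, cx, cy):
    """The matrix c0*1 + cx*sigma_x + cy*sigma_y."""
    return [[c0 * ID[r][c] + cx * SX[r][c] + cy * SY[r][c] for c in range(2)]
            for r in range(2)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def rho(theta):
    return combo(0.5, 0.5 * R * math.cos(theta), 0.5 * R * math.sin(theta))

def rho_derivative(theta):
    return combo(0.0, -0.5 * R * math.sin(theta), 0.5 * R * math.cos(theta))

def quantum_score(theta):
    """Candidate symmetric logarithmic derivative for this particular model."""
    return combo(0.0, -R * math.sin(theta), R * math.cos(theta))

def jordan(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[0.5 * (ab[r][c] + ba[r][c]) for c in range(2)] for r in range(2)]

def quantum_information(theta):
    score = quantum_score(theta)
    return trace(matmul(rho(theta), matmul(score, score))).real

def classical_information_sigma_x(theta):
    """Fisher information of the outcomes +1, -1 of a simple measurement of sigma_x."""
    info = 0.0
    for sign in (+1, -1):
        p = 0.5 * (1 + sign * R * math.cos(theta))   # outcome probability
        dp = -0.5 * sign * R * math.sin(theta)       # its derivative in theta
        info += dp * dp / p
    return info
```

For this model the quantum information is $R^2$ for every $\theta$, while the $\sigma_x$ measurement attains $R^2\sin^2\theta/(1 - R^2\cos^2\theta)$, which never exceeds it.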

4 Quantum Exponential and Quantum Transformation Models

In traditional statistics, the two major classes of parametric models are the exponential models (in which the log densities are affine functions of appropriate parameters) and the transformation (or group) models (in which a group acts in a consistent fashion on both the sample space and the parameter space); see Barndorff-Nielsen and Cox (1994). The intersection of these classes is the class of exponential transformation models, and its members have a particularly nice structure. There are quantum analogues of these classes, and they have useful properties.

4.1 Quantum Exponential Models 

A quantum exponential model is a quantum statistical model for which the states $\rho(\theta)$ can be represented in the form

$$\rho(\theta) = e^{-\kappa(\theta)}\,e^{\frac12\bar\gamma^r(\theta)T_r^*}\,\rho_0\,e^{\frac12\gamma^r(\theta)T_r}, \qquad \theta\in\Theta,$$

where $\gamma = (\gamma^1,\ldots,\gamma^k):\Theta\to\mathbb C^k$, $T_1,\ldots,T_k$ are operators on $\mathcal H$, $\rho_0$ is self-adjoint and non-negative (but not necessarily a density matrix), the Einstein summation convention (of summing over any index which appears as both a subscript and a superscript) has been used, and $\kappa(\theta)$ is a log norming constant, given by

$$\kappa(\theta) = \log\mathrm{tr}\{e^{\frac12\bar\gamma^r(\theta)T_r^*}\,\rho_0\,e^{\frac12\gamma^r(\theta)T_r}\}.$$

Three important special types of quantum exponential model are those in which $T_1,\ldots,T_k$ are bounded and self-adjoint (and for the first type, $T_0, T_1,\ldots,T_k$ all commute) and the quantum states have the forms

$$\rho(\theta) = e^{-\kappa(\theta)}\exp\{T_0 + \theta^rT_r\} \tag{14}$$
$$\rho(\theta) = e^{-\kappa(\theta)}\exp\{\tfrac12\theta^rT_r\}\,\rho_0\,\exp\{\tfrac12\theta^rT_r\} \tag{15}$$
$$\rho(\theta) = \exp\{-\tfrac12 i\theta^rT_r\}\,\rho_0\,\exp\{\tfrac12 i\theta^rT_r\}, \tag{16}$$

respectively, where $\theta = (\theta^1,\ldots,\theta^k)\in\mathbb R^k$ and $\rho_0\in\mathbb{SA}_+(\mathcal H)$, and the summation convention is in force.

We call these three types the quantum exponential models of mechanical type, symmetric type, and unitary type respectively. The mechanical type arises (at least, with $k = 1$) in quantum statistical mechanics as a state of statistical equilibrium, see Gardiner and Zoller (2000, Sect. 2.4.2). The symmetric type has theoretical statistical significance, as we shall see, connected among other things to the fact that the quantum score for this model is easy to compute explicitly. The unitary type has physical significance connected to the fact that it is also a transformation model (quantum transformation models are defined in the next subsection). The mechanical type is a special case of the symmetric type when $T_0, T_1,\ldots,T_k$ all commute.

In general, the statistical model obtained by applying a measurement to a quantum exponential model is not an exponential model (in the classical sense). However, for a quantum exponential model of the form (15) in which

$$T_j = t_j(X), \quad j = 1,\ldots,k, \quad\text{for some } X \text{ in } \mathbb{SA}(\mathcal H), \tag{17}$$

i.e., the $T_j$ commute, the statistical model obtained by applying the measurement $X$ is a full exponential model. Various pleasant properties of such quantum exponential models then follow from standard properties of the full exponential models.

The classical Cramér-Rao bound for the variance of an unbiased estimator $t$ of $\theta$ is

$$\mathrm{Var}(t) \ge i(\theta;M)^{-1}. \tag{18}$$

Combining (18) with Braunstein and Caves' (1994) quantum information bound $i(\theta;M)\le I(\theta)$, which we derive later as (31), yields Helstrom's (1976) quantum Cramér-Rao bound

$$\mathrm{Var}(t) \ge I(\theta)^{-1}, \tag{19}$$

whenever $t$ is an unbiased estimator based on a quantum measurement. It is a classical result that, under certain regularity conditions, the following are equivalent: (i) equality holds in (18), (ii) the score is an affine function of $t$, (iii) the model is exponential with $t$ as canonical statistic (cf. pp. 254-255 of Cox and Hinkley, 1974). This result has a quantum analogue, given in the theorems and corollary below, which states that under certain regularity conditions, there is equivalence between (i) equality holds in (19) for some unbiased estimator $t$ based on some measurement $M$, (ii) the symmetric quantum score is an affine function of commuting $T_1,\ldots,T_k$, and (iii) the quantum model is a quantum exponential model of type (15) where $T_1,\ldots,T_k$ satisfy (17). The regularity conditions which we assume below are indubitably too strong: the result should be true under minimal smoothness assumptions.
4.2 Quantum Transformation Models

Consider a parametric quantum model $(\rho, M)$ consisting of a family $\rho = \{\rho(\theta) : \theta\in\Theta\}$ of states and a measurement $M$ with outcome space $(\mathcal X,\mathcal A)$. Suppose there exists a group, $G$, with elements $g$ acting both on $\mathcal X$ and on $\Theta$ in such a way that the following consistency condition holds

$$\mathrm{tr}\{\rho(\theta)M(A)\} = \mathrm{tr}\{\rho(g\theta)M(g^{-1}A)\} \tag{20}$$

for all $\theta$, $A$ and $g$. If, moreover, $G$ acts transitively on $\Theta$ we say that $(\rho, M)$ is a quantum transformation model. In this case, the resulting statistical model for the outcome of a measurement of $M$, i.e. $(\mathcal X,\mathcal A,\mathcal P)$, where $\mathcal P = \{\mathrm{tr}\{\rho(\theta)M\} : \theta\in\Theta\}$, is a classical transformation model. Consequently, the Main Theorem for transformation models, see Barndorff-Nielsen and Cox (1994, pp. 56-57) and references given there, applies to $(\mathcal X,\mathcal A,\mathcal P)$.

Of particular physical interest are situations where the actions of $G$ are such that

$$M(g^{-1}A) = U_g^*\,M(A)\,U_g, \qquad A\in\mathcal A, \tag{21}$$
$$\rho(g\theta) = U_g^*\,\rho(\theta)\,U_g, \tag{22}$$

where the $U_g$ are unitary matrices satisfying

$$U_{gh} = w(g,h)\,U_gU_h, \qquad g,h\in G, \tag{23}$$

for some complex valued function $w$ with $|w(g,h)| = 1$ for all $g$ and $h$. A mapping $g\mapsto U_g$ with the property (23) is said to constitute a projective unitary representation of $G$ and a measurement $M$ satisfying (21) is termed covariant in the physical literature; equivariant would be a more correct terminology. Under certain conditions, equivariant measurements are representable in the form

$$M(A) = \int_{\{g\,:\,g^{-1}x_0\in A\}} U_g^*\,R_0\,U_g\,\mu(dg)$$

for an invariant measure $\mu$ on $G$, a fixed non-negative self-adjoint operator $R_0$ on $\mathcal H$ and some fixed point $x_0\in\mathcal X$.

Example 7 (Equivariant measurements for spin-half). Suppose both outcome space $\mathcal X$ and group $G$ are the unit circle $S^1$. Let the Hilbert space $\mathcal H$ be $\mathbb C^2$ and let $S^1$ act on $\mathcal H$ via the projective representation

Then by Holevo (1982, p. 175, with $j = \frac12$) any equivariant $M$ has

$$m(\phi) = \begin{pmatrix} 1 & \bar a\,e^{i\phi} \\ a\,e^{-i\phi} & 1 \end{pmatrix}$$

with respect to the uniform distribution on $S^1$, for some $a$ with $|a|\le 1$. □
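As a sanity check on densities of this kind, the sketch below assumes a density of the form $m(\phi) = \begin{pmatrix} 1 & \bar a e^{i\phi} \\ a e^{-i\phi} & 1\end{pmatrix}$ with respect to the uniform distribution on the circle, and verifies numerically that it averages to the identity and is non-negative when $|a|\le 1$. The placement of the complex conjugate, the value of `A` and the function names are assumptions made for this illustration.

```python
import cmath
import math

def density(phi, a):
    """Assumed 2x2 density: unit diagonal, off-diagonal a*exp(-i*phi) and its conjugate."""
    off = a * cmath.exp(-1j * phi)
    return [[1.0, off.conjugate()], [off, 1.0]]

def average_over_circle(a, steps=360):
    """Riemann-sum approximation of (1/2pi) * integral of m(phi) over [0, 2pi)."""
    acc = [[0j, 0j], [0j, 0j]]
    for k in range(steps):
        m = density(2 * math.pi * k / steps, a)
        for r in range(2):
            for c in range(2):
                acc[r][c] += m[r][c] / steps
    return acc

def eigenvalues(m):
    """Eigenvalues of a Hermitian 2x2 matrix with equal diagonal entries."""
    return (m[0][0].real - abs(m[0][1]), m[0][0].real + abs(m[0][1]))

A = 0.6 + 0.3j  # hypothetical value with |A| < 1
mean_density = average_over_circle(A)
smallest = min(eigenvalues(density(1.1, A)))
```

The off-diagonal terms average out over the circle, so the normalisation holds for every $a$; the constraint $|a|\le1$ is what keeps the smaller eigenvalue $1 - |a|$ non-negative.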

Example 8 (Equivariant measurements for spin-j). The preceding example generalises to spin-$j$ coherent states. Again, both the outcome space $\mathcal X$ and the group $G$ are the unit circle $S^1$. Now let the Hilbert space $\mathcal H$ be $\odot^n\mathbb C^2$. Define the operator $J$ on $\mathcal H$ by

$$J = \sum_{m=-j}^{j} m\,|m\rangle\langle m|,$$

where $j = n/2$ and $|m\rangle$ is defined in Q. Then putting

$$U_\phi = e^{i\phi J}, \qquad \phi\in S^1,$$

gives a projective representation of $S^1$ on $\mathcal H$. By Holevo (1982, p. 175) any equivariant measurement has density

with respect to the uniform distribution on $S^1$, for some positive operator $R_0$ satisfying

$$\frac{1}{2\pi}\int_0^{2\pi} e^{-i\phi J}\,R_0\,e^{i\phi J}\,d\phi = \mathbf 1.$$

□



4.3 Quantum Exponential Transformation Models 

A quantum exponential transformation model is a quantum exponential model which is also a quantum transformation model. The pleasant properties of classical exponential transformation models (Barndorff-Nielsen et al., 1982) are shared by a large class of quantum exponential transformation models of the form (15) which satisfy (17). In particular, if $\mathcal H$ is finite-dimensional and the group acts transitively then there is a unique affine action of the group on $\mathbb R^k$ such that $(t_1,\ldots,t_k):\mathcal X\to\mathbb R^k$ is equivariant.

Example 9 (Spin-half: great circle model). Consider the spin-half model $\rho(\theta) = U\,\frac12(\mathbf 1 + \cos\theta\,\sigma_x + \sin\theta\,\sigma_y)\,U^*$ where $U$ is a fixed $2\times2$ unitary matrix, and $\sigma_x$ and $\sigma_y$ are two of the Pauli spin matrices, while the parameter $\theta$ varies through $[0, 2\pi)$; see Example^ The matrix $U$ can always be written as $\exp\{-i\phi\,\vec u\cdot\vec\sigma\}$ for some real three-dimensional unit vector $\vec u$ and angle $\phi$. Considered as a curve on the Poincaré sphere, the model forms a great circle. If $U$ is the identity (or, equivalently, $\phi = 0$) the curve just follows the line of the equator; the presence of $U$ corresponds to rotating the sphere carrying this curve about the direction $\vec u$ through an angle $\phi$. Thus our model describes an arbitrary great circle on the Poincaré sphere, parametrised in a natural way. Since we can write $\rho(\theta) = UV_\theta U^*\,\rho(0)\,UV_\theta^*U^*$, where the unitary matrix $V_\theta$ corresponds to rotation of the Poincaré sphere by an angle $\theta$ about the $z$-axis, we can write this model as a unitary transformation model of the form (22). Together with any equivariant measurement, this model forms a quantum transformation model. The model is clearly also an exponential model of unitary type. Perhaps surprisingly, it can be reparameterised so as also to be an exponential model of symmetric type. We leave the details (which depend on the algebraic properties of the Pauli spin matrices) to the reader, but just point out that a one-parameter pure-state exponential model of symmetric type has to be of the form $\rho(\theta) = \exp(-\kappa(\theta))\exp(\frac12\theta\,\vec u\cdot\vec\sigma)\,\frac12(\mathbf 1 + \vec v\cdot\vec\sigma)\,\exp(\frac12\theta\,\vec u\cdot\vec\sigma)$ for some real unit vectors $\vec u$ and $\vec v$, since every self-adjoint $2\times2$ matrix is an affine function of a spin matrix $\vec u\cdot\vec\sigma$. Now write out the exponential of a matrix as its power series, and use the fact that the square of any spin matrix is the identity.

This example is due to Fujiwara and Nagaoka (1995). □
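The hint at the end of Example 9 can be checked numerically: because the square of a spin matrix $\vec u\cdot\vec\sigma$ is the identity, the power series of $\exp(\frac12\theta\,\vec u\cdot\vec\sigma)$ collapses to $\cosh(\theta/2)\,\mathbf 1 + \sinh(\theta/2)\,\vec u\cdot\vec\sigma$. The sketch below compares a truncated power series with this closed form; the unit vector `U_VEC`, the value of `THETA` and the function names are hypothetical choices.

```python
import math

def spin_matrix(u):
    """u.sigma = ux*sigma_x + uy*sigma_y + uz*sigma_z for a real vector u."""
    ux, uy, uz = u
    return [[uz, ux - 1j * uy], [ux + 1j * uy, -uz]]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def exp_series(a, terms=40):
    """Truncated power series sum_{n < terms} a^n / n! of a 2x2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, a)]
        result = [[result[r][c] + term[r][c] for c in range(2)] for r in range(2)]
    return result

U_VEC = (1 / 3, 2 / 3, 2 / 3)  # hypothetical real unit vector
THETA = 1.3                    # hypothetical parameter value

S = spin_matrix(U_VEC)
square = matmul(S, S)  # should equal the identity
series = exp_series([[0.5 * THETA * v for v in row] for row in S])
closed = [[math.cosh(THETA / 2) * (1.0 if r == c else 0.0)
           + math.sinh(THETA / 2) * S[r][c] for c in range(2)] for r in range(2)]
```

With the closed form in hand, the symmetric-type expression in the example reduces to products of at most three spin matrices, which is the computation left to the reader.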



5 Quantum Exhaustivity and Sufficiency 

This section introduces and relates some concepts connected to the classical notion of sufficiency. 



5.1 Quantum Exhaustivity 

An important role is played by quantum instruments for which no information on the unknown pa- 
rameter of a quantum parametric model of states can be obtained from subsequent measurements 
on the given physical system. 

Recall that an instrument $\mathcal N$ is represented by a collection of observables $\mathcal N(A)[Y]$, defined in the following implicit fashion. For any particular $A$ and $Y$, the expectation of the outcome of measuring the observable $\mathcal N(A)[Y]$ on a system in state $\rho$ is the same as the expectation of a function of the joint outcomes of first applying the instrument to a system in state $\rho$ and next measuring the observable $Y$ on the posterior state: namely, take the product of the indicator variable that the outcome of the instrument is in $A$, with the outcome of the subsequent measurement of $Y$. This collection of observables determines uniquely the probability distribution $\pi(dx;\rho,\mathcal N)$ of the outcome of applying the instrument $\mathcal N$ to the state $\rho$, and the posterior state $\sigma(x;\rho,\mathcal N)$ given that the outcome is $x$. They are related to the $\mathcal N(A)[Y]$ by the equality (which we just expressed in words)

$$\mathrm{tr}\{\rho\,\mathcal N(A)[Y]\} = \int_A \mathrm{tr}\{\sigma(x;\rho,\mathcal N)Y\}\,\pi(dx;\rho,\mathcal N).$$

In the sequel we will drop the name of the instrument in the notation for $\pi$ and $\sigma$ and, when considering a parameterised family of prior states, replace the prior state $\rho(\theta)$ by the parameter value $\theta$: thus $\pi(dx;\theta)$ denotes the probability distribution of the outcome, and $\sigma(x;\theta)$ denotes the posterior state.

Definition 1 (Exhaustive instruments). A quantum instrument $\mathcal N$ is exhaustive for a parameterised
set $\rho : \Theta \to \mathcal S(\mathcal H)$ of states if for all $\theta$ in $\Theta$ and for $\pi(\cdot;\theta)$-almost all $x$, $\sigma(x;\theta)$ does not
depend on $\theta$.

Thus the posterior states obtained from exhaustive quantum instruments are completely determined
by the result of the measurement and do not depend on $\theta$.
A useful strong form of exhaustivity is defined as follows.

Definition 2 (Completely exhaustive instruments). A quantum instrument $\mathcal N$ is completely
exhaustive if it is exhaustive for all parameterised sets of states.

Recall that any completely positive instrument (in other words, virtually any physically realisable
instrument) has $\mathcal N(A)[Y]$ of the form given by

$$\mathcal N(dx)[Y] = \sum_i W_i(x)^* \, Y \, W_i(x)\,\nu(dx) \qquad (24)$$

with posterior states

$$\sigma(x;\rho) = \frac{\sum_i W_i(x)\,\rho\,W_i(x)^*}{\sum_i \mathrm{tr}\{\rho\,W_i(x)^* W_i(x)\}}$$

and outcome distributed as

$$\pi(dx;\rho) = \sum_i \mathrm{tr}\{\rho\,W_i(x)^* W_i(x)\}\,\nu(dx).$$

The following Proposition (which is a slight generalisation of a result of Wiseman (1999)) shows one
way of constructing completely exhaustive completely positive quantum instruments.

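Proposition 1. Let the quantum instrument $\mathcal N$ be as above, with $W_i(x)$ of the form

$$W_i(x) = |\psi_{i,x}\rangle\langle\phi_x| \qquad (25)$$

for some functions $(i,x) \mapsto \psi_{i,x}$ and $x \mapsto \phi_x$. Then $\mathcal N$ is completely exhaustive.

Proof. By inspection we find that the posterior state is

$$\sigma(x;\rho) = \frac{\sum_i |\psi_{i,x}\rangle\langle\psi_{i,x}|}{\sum_i \langle\psi_{i,x}|\psi_{i,x}\rangle},$$

which does not depend on the prior state $\rho$. □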
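
The "by inspection" step of Proposition 1 can be checked numerically. The sketch below is illustrative only: it assumes operators of the form $W_i = |\psi_i\rangle\langle\phi|$ (our reading of (25), for a single fixed outcome $x$), and the particular vectors `phi`, `psi_1`, `psi_2` and prior states `rho_a`, `rho_b` are arbitrary choices made for the example, not taken from the paper.

```python
# Pure-Python check that the posterior state sum_i W_i rho W_i^* / trace
# does not depend on the prior rho when W_i = |psi_i><phi|.

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def outer(ket, bra_vec):
    """The operator |ket><bra_vec|."""
    return [[k * b.conjugate() for b in bra_vec] for k in ket]

def posterior(rho, ops):
    """Normalised posterior state sum_i W_i rho W_i^*."""
    d = len(rho)
    total = [[0j] * d for _ in range(d)]
    for w in ops:
        term = matmul(matmul(w, rho), dagger(w))
        total = [[total[i][j] + term[i][j] for j in range(d)] for i in range(d)]
    norm = trace(total)
    return [[total[i][j] / norm for j in range(d)] for i in range(d)]

# Hypothetical example data (assumptions made for illustration).
phi = [0.6, 0.8j]
psi_1 = [1.0, 0.0]
psi_2 = [0.5, 0.5j]
ops = [outer(psi_1, phi), outer(psi_2, phi)]

rho_a = [[0.7, 0.1], [0.1, 0.3]]        # two different prior states
rho_b = [[0.2, -0.3j], [0.3j, 0.8]]

post_a = posterior(rho_a, ops)
post_b = posterior(rho_b, ops)

# Closed form from the proof: sum_i |psi_i><psi_i| / sum_i <psi_i|psi_i>.
raw = [[outer(psi_1, psi_1)[i][j] + outer(psi_2, psi_2)[i][j]
        for j in range(2)] for i in range(2)]
expected = [[raw[i][j] / trace(raw) for j in range(2)] for i in range(2)]

max_diff_priors = max(abs(post_a[i][j] - post_b[i][j])
                      for i in range(2) for j in range(2))
max_diff_formula = max(abs(post_a[i][j] - expected[i][j])
                       for i in range(2) for j in range(2))
```

Both `max_diff_priors` and `max_diff_formula` should be zero up to rounding, because each term $W_i\rho W_i^*$ equals the scalar $\langle\phi|\rho|\phi\rangle$ times $|\psi_i\rangle\langle\psi_i|$, and that scalar cancels on normalisation.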

5.2 Quantum Sufficiency



Suppose the measurement $M' = M \circ T^{-1}$ is a coarsening of the measurement $M$. In this situation
we say that $M'$ is (classically) sufficient for $M$ with respect to a family of states $\rho = \{\rho(\theta) : \theta \in \Theta\}$
on $\mathcal H$ if the mapping $T$ is sufficient for the identity mapping on $(\mathcal X, \mathcal A)$ with respect to
the family $\{P(\cdot;\theta;M) : \theta \in \Theta\}$ of probability measures on $(\mathcal X, \mathcal A)$ induced by $M$ and $\rho$ (that is,
$P(\cdot;\theta;M) = \mathrm{tr}\{M(\cdot)\rho(\theta)\}$).

As a further step towards a definition of quantum sufficiency, we introduce a concept of inferential
equivalence of parametric models of states.

Definition 3 (Inferential equivalence). Two parametric families of states $\rho = \{\rho(\theta) : \theta \in \Theta\}$
and $\sigma = \{\sigma(\theta) : \theta \in \Theta\}$ on Hilbert spaces $\mathcal H$ and $\mathcal K$ are said to be inferentially equivalent if for
every measurement $M$ on $\mathcal H$ there exists a measurement $N$ on $\mathcal K$ such that for all $\theta \in \Theta$

$$\mathrm{tr}\{M(\cdot)\rho(\theta)\} = \mathrm{tr}\{N(\cdot)\sigma(\theta)\} \qquad (26)$$

and vice versa. (Note that, implicitly, the outcome spaces of $M$ and $N$ are assumed to be identical.)

In other words, $\rho$ and $\sigma$ are equivalent if and only if they give rise to the same class of possible
classical models for inference on the unknown parameter.

Example 10 (Two identical spin-half particles vs. one coherent spin-one particle).
Let $\rho = \{\rho(\theta) : \theta \in \Theta\}$ be a parametric family of coherent spin-1 states; see
Section 2.1.4 above. Then the associated Hilbert space $\mathcal H$ is $\mathbb C^2 \otimes \mathbb C^2$. Recall that the state vectors
of coherent spin-1 states lie in the subspace $\mathcal K = \mathbb C^2 \odot \mathbb C^2$ of $\mathbb C^2 \otimes \mathbb C^2$. Define the parametric
family $\sigma = \{\sigma(\theta) : \theta \in \Theta\}$ by $\sigma(\theta) = \Pi_\odot\,\rho(\theta)\,\iota$, where $\Pi_\odot$ and $\iota$ are the orthogonal projection from
$\mathbb C^2 \otimes \mathbb C^2$ to $\mathbb C^2 \odot \mathbb C^2$ and the inclusion of $\mathcal K$ in $\mathcal H$, respectively. Given a measurement $M$ on $\mathcal H$,
we can define a measurement $N$ on $\mathcal K$ by $N(\cdot) = \Pi_\odot M(\cdot)\,\iota$. Similarly, given a measurement $N$ on
$\mathcal K$, we can define a measurement $M$ on $\mathcal H$ by $M(\cdot) = \iota\,N(\cdot)\,\Pi_\odot$. It is simple to verify that (26) is
satisfied, and so $\rho$ and $\sigma$ are inferentially equivalent. □

Remark 1. It is of interest to find characterisations of inferential equivalence. This is a nontrivial
problem, even in the case where the Hilbert spaces $\mathcal H$ and $\mathcal K$ are the same.

Next, let $\mathcal N$ denote an instrument on a Hilbert space $\mathcal H$ with outcome space $(\mathcal X, \mathcal A)$ and let
$\mathcal N' = \mathcal N \circ T^{-1}$ be a coarsening of $\mathcal N$ with outcome space $(\mathcal Y, \mathcal B)$, generated by a mapping $T$ from
$(\mathcal X, \mathcal A)$ to $(\mathcal Y, \mathcal B)$. According to the formula for coarsenings of instruments given earlier, the
posterior states for the two instruments are related by

$$\sigma(t;\theta,\mathcal N') = \int_{T^{-1}(t)} \sigma(x;\theta,\mathcal N)\,\pi(dx \mid t;\theta,\mathcal N),$$

where $\pi(dx \mid t;\theta,\mathcal N)$ is the conditional distribution of $x$ given $T(x) = t$ computed from $\pi(dx;\theta,\mathcal N)$.

Definition 4 (Quantum sufficiency of instruments). Let $\mathcal N'$ be a coarsening of an instrument
$\mathcal N$ by $T : (\mathcal X, \mathcal A) \to (\mathcal Y, \mathcal B)$. Then $\mathcal N'$ is said to be quantum sufficient with respect to a family of
states $\{\rho(\theta) : \theta \in \Theta\}$ if

(i) the measurement $M'(\cdot) = \mathcal N'(\cdot)[\mathbf 1]$ is sufficient for the measurement $M(\cdot) = \mathcal N(\cdot)[\mathbf 1]$, with
respect to the family $\{\rho(\theta) : \theta \in \Theta\}$,

(ii) for any $x \in \mathcal X$, the posterior families $\{\sigma(x;\theta,\mathcal N) : \theta \in \Theta\}$ and $\{\sigma(T(x);\theta,\mathcal N') : \theta \in \Theta\}$ are
inferentially equivalent.

5.3 Exhaustivity, Sufficiency, Ancillarity and Separability

In the theory of classical statistical inference, many important concepts (such as sufficiency,
ancillarity and cuts) can be expressed in terms of the decomposition by a measurable function
$T : (\mathcal X, \mathcal A) \to (\mathcal Y, \mathcal B)$ of each probability distribution on $(\mathcal X, \mathcal A)$ into the corresponding marginal
distribution of $T(x)$ and the family of conditional distributions of $x$ given $T(x)$. In quantum
statistics there are analogous concepts based on the decomposition

$$\rho \mapsto \bigl(\pi(\cdot;\rho,\mathcal N),\ \sigma(\cdot;\rho,\mathcal N)\bigr) \qquad (27)$$

by a quantum instrument $\mathcal N$ of each state $\rho$ into a measurement and a family of posterior states;
see Section 2.3.

The classical concept of a cut encompasses those of sufficiency and ancillarity and is therefore
more basic. A measurable function $T$ is a cut for a set $\mathcal P$ of probability distributions on $\mathcal X$ if
for all $p_1$ and $p_2$ in $\mathcal P$, the distribution on $\mathcal X$ obtained by combining the marginal distribution of
$T(x)$ given by $p_1$ with the family of conditional distributions of $x$ given $T(x)$ given by $p_2$ is also
in $\mathcal P$; see, e.g., p. 38 of Barndorff-Nielsen and Cox (1994). Recent results on cuts for exponential
models can be found in Barndorff-Nielsen and Koudou (1995), which also gives references to the
useful role which cuts have played in graphical models. A generalisation to local cuts has become
important in econometrics (Christensen and Kiefer, 1994, 2000). Replacing the decomposition
into marginal and conditional distributions in the definition of a cut by the decomposition (27)
yields the following quantum analogue.

Definition 5 (Quantum cuts). A quantum instrument $\mathcal N$ is said to be a quantum cut for a
family $\rho$ of states if for all $\rho_1$ and $\rho_2$ in $\rho$, there is a $\rho_3$ in $\rho$ such that

$$\pi(\cdot;\rho_3,\mathcal N) = \pi(\cdot;\rho_1,\mathcal N),$$
$$\sigma(\cdot;\rho_3,\mathcal N) = \sigma(\cdot;\rho_2,\mathcal N).$$

Thus, if $\mathcal N$ is a quantum cut for a family $\rho = \{\rho(\theta) : \theta \in \Theta\}$ with $\rho$ a one-to-one function
then $\Theta$ has the product form $\Theta = \Psi \times \Phi$ and furthermore $\sigma(\cdot;\rho(\theta),\mathcal N)$ depends on $\theta$ only through
$\psi$, and $\pi(\cdot;\rho(\theta),\mathcal N)$ depends on $\theta$ only through $\phi$.

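Example 11 (Simple quantum cuts). Let $\{\Pi_{[x]} : x \in \mathcal X\}$ be a PProM on a finite-dimensional
quantum system. Suppose that sets $\Psi$ and $\Phi$ are given, together with collections of functions
(indexed by $x$ in $\mathcal X$) $f_x : \Phi \to [0,1]$ and $M_x : \Psi \to \mathcal S(\mathcal H)$ which satisfy

$$\sum_x f_x(\phi) = 1, \qquad \phi \in \Phi,$$
$$M_x(\psi) = \Pi_{[x]}\,M_x(\psi)\,\Pi_{[x]}, \qquad \psi \in \Psi.$$

Then we can define a family of states $\{\rho(\psi,\phi) : \psi \in \Psi,\ \phi \in \Phi\}$ by

$$\rho(\psi,\phi) = \sum_x f_x(\phi)\,M_x(\psi), \qquad (\psi,\phi) \in \Psi \times \Phi.$$

As indicated in an earlier example, $\{\Pi_{[x]} : x \in \mathcal X\}$ gives rise to a simple quantum instrument $\mathcal N$, defined
by

$$\mathcal N(\{x\})[Y] = \Pi_{[x]}\,Y\,\Pi_{[x]}.$$

A straightforward calculation using the orthogonality of the projections $\Pi_{[x]}$ shows that

$$\pi(x;\rho(\psi,\phi),\mathcal N) = f_x(\phi), \qquad \sigma(x;\rho(\psi,\phi),\mathcal N) = M_x(\psi),$$

and so $\mathcal N$ is a quantum cut for $\rho$. □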

Since a quantum instrument is exhaustive for a parameterised set $\rho = \{\rho(\theta) : \theta \in \Theta\}$ of
states if the family $\sigma(\cdot;\rho(\theta),\mathcal N)$ of posterior states does not depend on $\theta$, exhaustive quantum
instruments are quantum cuts of a special kind. They can be regarded as quantum analogues of
sufficient statistics. At the other extreme are the quantum instruments for which the measurements
$\pi(\cdot;\rho(\theta),\mathcal N)$ do not depend on $\theta$. These can be regarded as quantum analogues of ancillary
statistics.

Unlike exhaustivity, the concept of quantum sufficiency involves not only a quantum instrument
but also a coarsening. The definition of quantum sufficiency can be extended to the following
version involving parameters of interest.

Definition 6 (Quantum sufficiency for interest parameters). Let $\rho = \{\rho(\theta) : \theta \in \Theta\}$ be a
family of states and let $\psi : \Theta \to \Psi$ map $\Theta$ to the space $\Psi$ of interest parameters. A coarsening $\mathcal N'$
of an instrument $\mathcal N$ by a mapping $T$ is said to be quantum sufficient for $\psi$ on $\rho$ if

(i) the measurement $\mathcal N'(\cdot)[\mathbf 1]$ is sufficient for $\mathcal N(\cdot)[\mathbf 1]$ with respect to the family $\rho$,

(ii) for all $\theta_1$ and $\theta_2$ with $\psi(\theta_1) = \psi(\theta_2)$ and for all $x$ in $\mathcal X$, the sets $\sigma(x;\rho(\theta_1),\mathcal N)$ and
$\sigma(T(x);\rho(\theta_2),\mathcal N')$ of posterior states are inferentially equivalent.

Consideration of the likelihood function obtained by applying a measurement to a parameterised
set of states suggests that the following weakening of the concept of inferential equivalence
may be useful.

Definition 7 (Weak likelihood equivalence). Two parametric families of states $\rho = \{\rho(\theta) : \theta \in \Theta\}$
and $\sigma = \{\sigma(\theta) : \theta \in \Theta\}$ on Hilbert spaces $\mathcal H$ and $\mathcal K$ respectively are said to be weakly
likelihood equivalent if for every measurement $M$ on $\mathcal H$ there is a measurement $N$ on $\mathcal K$ with the
same outcome space, such that

$$\frac{\mathrm{tr}\{M(dx)\rho(\theta)\}}{\mathrm{tr}\{M(dx)\rho(\theta')\}} = \frac{\mathrm{tr}\{N(dx)\sigma(\theta)\}}{\mathrm{tr}\{N(dx)\sigma(\theta')\}}, \qquad \theta, \theta' \in \Theta$$

(whenever these ratios are defined) and vice versa.

Thus the likelihood function of the statistical model obtained by applying $M$ to $\rho$ is equivalent
to that obtained by applying $N$ to $\sigma$, for the same outcome of each instrument.

Consideration of the distribution of the likelihood ratio leads to the following definition.

Definition 8 (Strong likelihood equivalence). Two parametric families of states $\rho = \{\rho(\theta) : \theta \in \Theta\}$
and $\sigma = \{\sigma(\theta) : \theta \in \Theta\}$ on Hilbert spaces $\mathcal H$ and $\mathcal K$ respectively are said to be strongly
likelihood equivalent if for every measurement $M$ on $\mathcal H$ with outcome space $\mathcal X$ there is a measurement
$N$ on $\mathcal K$ with some outcome space $\mathcal Y$ such that the likelihood ratios

$$\frac{\mathrm{tr}\{M(dx)\rho(\theta)\}}{\mathrm{tr}\{M(dx)\rho(\theta')\}} \qquad \text{and} \qquad \frac{\mathrm{tr}\{N(dy)\sigma(\theta)\}}{\mathrm{tr}\{N(dy)\sigma(\theta')\}}$$

have the same distribution for all $\theta$, $\theta'$ in $\Theta$, and vice versa.

The precise connection between likelihood equivalence and inferential equivalence is not yet
known but the following conjecture appears reasonable.

Conjecture. Two quantum models are strongly likelihood equivalent if and only if they are
inferentially equivalent up to quantum randomisation.

6 Quantum and Classical Fisher Information

Earlier we showed how to express the Fisher information in the outcome of a measurement in
terms of the quantum score. In this section we discuss quantum analogues of Fisher information
and their relation to the classical concepts.

6.1 Definition and First Properties

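Differentiating the identity $\mathrm{tr}\{\rho(\theta)\rho_{//\theta}(\theta)\} = 0$ with respect to $\theta$, writing $\rho_{//\theta/\theta}$ for the derivative
of the symmetric logarithmic derivative $\rho_{//\theta}$ of $\rho$, and using the defining equation (10) for $\rho_{//\theta}$
and the fact that $\rho$ and $\rho_{//\theta}$ are self-adjoint, we obtain

$$0 = \Re\,\mathrm{tr}\{\rho_{/\theta}(\theta)\rho_{//\theta}(\theta) + \rho(\theta)\rho_{//\theta/\theta}(\theta)\}$$
$$\ \ = \Re\,\mathrm{tr}\bigl\{\tfrac12\bigl(\rho(\theta)\rho_{//\theta}(\theta) + \rho_{//\theta}(\theta)\rho(\theta)\bigr)\rho_{//\theta}(\theta)\bigr\} + \Re\,\mathrm{tr}\{\rho(\theta)\rho_{//\theta/\theta}(\theta)\}$$
$$\ \ = I(\theta) - \mathrm{tr}\{\rho(\theta)J(\theta)\},$$

where

$$I(\theta) = \mathrm{tr}\{\rho(\theta)\rho_{//\theta}(\theta)^2\}$$

is the expected (or Fisher) quantum information, already mentioned in earlier sections, and

$$J(\theta) = -\rho_{//\theta/\theta}(\theta),$$

which we shall call the observable quantum information. Thus

$$I(\theta) = \mathrm{tr}\{\rho(\theta)J(\theta)\},$$

which is a quantum analogue of the classical relation $i(\theta) = E_\theta[j(\theta)]$ between expected and
observed information (where $j(\theta) = -l_{/\theta/\theta}(\theta)$). Note that $J(\theta)$ is an observable, just as $j(\theta)$ is a
random variable.

Neither $I(\theta)$ nor $J(\theta)$ depends on the choice of measurement, whereas $i(\theta) = i(\theta;M)$ does
depend on the measurement $M$.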

For parametric quantum models of states of the form

$$\rho : \theta \mapsto \rho_1(\theta) \otimes \cdots \otimes \rho_n(\theta)$$

(which model 'independent particles'), the associated expected quantum information satisfies

$$I_{\rho_1 \otimes \cdots \otimes \rho_n}(\theta) = \sum_{i=1}^n I_{\rho_i}(\theta),$$

which is analogous to the additivity property of Fisher information. In particular, for parametric
quantum models of states of the form

$$\rho : \theta \mapsto \rho(\theta) \otimes \cdots \otimes \rho(\theta) \qquad (28)$$

(which model $n$ 'independent and identical particles'), the associated expected quantum information
$I_n$ satisfies

$$I_n(\theta) = n I(\theta), \qquad (29)$$

where $I(\theta)$ denotes the expected quantum information for a single measurement of the same type.
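
The additivity relation (29) can be derived in a few lines. The sketch below assumes the defining relation $\rho_{/\theta} = \tfrac12(\rho\,\rho_{//\theta} + \rho_{//\theta}\,\rho)$ for the symmetric logarithmic derivative; the symbols $L_k$ and $L^{(n)}$ are shorthand introduced here for illustration.

```latex
% L_k acts as the single-particle score on the k-th factor and as the identity elsewhere.
\begin{align*}
L_k &= \mathbf 1\otimes\cdots\otimes\rho_{//\theta}\otimes\cdots\otimes\mathbf 1,
\qquad L^{(n)} = \sum_{k=1}^n L_k, \\
\frac{\partial}{\partial\theta}\,\rho^{\otimes n}
 &= \sum_{k=1}^n \rho\otimes\cdots\otimes\rho_{/\theta}\otimes\cdots\otimes\rho
  = \tfrac12\bigl(\rho^{\otimes n}L^{(n)} + L^{(n)}\rho^{\otimes n}\bigr), \\
I_n(\theta) &= \mathrm{tr}\bigl\{\rho^{\otimes n}\,(L^{(n)})^2\bigr\}
  = \sum_{k}\mathrm{tr}\{\rho\,\rho_{//\theta}^2\}
   + \sum_{k\neq l}\bigl(\mathrm{tr}\{\rho\,\rho_{//\theta}\}\bigr)^2
  = n\,I(\theta),
\end{align*}
% since tr{rho rho_{//theta}} = tr{rho_{/theta}} = 0 kills the cross terms.
```

The second line shows that $L^{(n)}$ is the quantum score of the product model, and the cross terms vanish because the score has mean zero.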

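In the case of a multivariate parameter $\theta$, the expected quantum information matrix $I(\theta)$ is
defined in terms of the quantum scores by

$$I(\theta)_{jk} = \tfrac12\,\mathrm{tr}\bigl\{\rho_{//\theta_j}(\theta)\rho(\theta)\rho_{//\theta_k}(\theta) + \rho_{//\theta_k}(\theta)\rho(\theta)\rho_{//\theta_j}(\theta)\bigr\}. \qquad (30)$$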



6.2 Relation to Classical Expected Information 

Suppose that $\theta$ is one-dimensional. There is an important relationship between expected quantum
information $I(\theta)$ and classical expected information $i(\theta;M)$, due to Braunstein and Caves (1994),
namely that for any measurement $M$ with density $m$ with respect to a $\sigma$-finite measure $\nu$ on $\mathcal X$,

$$i(\theta;M) \le I(\theta), \qquad (31)$$

with equality if and only if, for $\nu$-almost all $x$,

$$m(x)^{1/2}\rho_{//\theta}(\theta)\rho(\theta)^{1/2} = r(x)\,m(x)^{1/2}\rho(\theta)^{1/2}, \qquad (32)$$

for some real number $r(x)$. For a proof see Appendix B.

For each $\theta$, there are measurements which attain the bound in the quantum information
inequality (31). For instance, we can choose $M$ such that each $m(x)$ is a projection onto an eigenspace
of the quantum score $\rho_{//\theta}(\theta)$. Note that this attaining measurement may depend on $\theta$.

Example 12 (Information for spin-half). Consider a spin-half particle in the pure state
$\rho = \rho(\eta,\theta) = |\psi(\eta,\theta)\rangle\langle\psi(\eta,\theta)|$ given by

$$|\psi(\eta,\theta)\rangle = \begin{pmatrix} e^{-i\theta/2}\cos(\eta/2) \\ e^{i\theta/2}\sin(\eta/2) \end{pmatrix}.$$

As we saw in an earlier example (where we wrote $(\eta,\vartheta)$ for $(\eta,\theta)$), $\rho$ can be written as
$\rho = (\mathbf 1 + u_x\sigma_x + u_y\sigma_y + u_z\sigma_z)/2 = \tfrac12(\mathbf 1 + \vec u\cdot\vec\sigma)$, where $\vec\sigma = (\sigma_x,\sigma_y,\sigma_z)$ are the three Pauli spin matrices and
$\vec u = (u_x,u_y,u_z) = \vec u(\eta,\theta)$ is the point on the Poincaré sphere $S^2$ with polar coordinates $(\eta,\theta)$.
Suppose that the colatitude $\eta$ is known and exclude the degenerate cases $\eta = 0$ or $\eta = \pi$; the
longitude $\theta$ is the unknown parameter.

Since all the $\rho(\theta)$ are pure, one can show that $\rho_{//\theta}(\theta) = 2\rho_{/\theta}(\theta) = \vec u_{/\theta}(\theta)\cdot\vec\sigma = \sin(\eta)\,\vec u(\pi/2,\theta+\pi/2)\cdot\vec\sigma$.
Using the properties of the Pauli matrices, one finds that the quantum information is

$$I(\theta) = \mathrm{tr}\{\rho(\theta)\rho_{//\theta}(\theta)^2\} = \sin^2\eta.$$

Following Barndorff-Nielsen and Gill (2000), we now state a condition that a measurement must
satisfy in order for it to achieve this information.

It follows from (32) that, for a pure spin-half state $\rho = |\psi\rangle\langle\psi|$, a necessary and sufficient
condition for a measurement to achieve the information bound is: for $\nu$-almost all $x$, $m(x)$ is
proportional to a one-dimensional projector $|\xi(x)\rangle\langle\xi(x)|$ satisfying

$$\langle\xi|2\rangle\langle 2|a\rangle = r(x)\,\langle\xi|1\rangle,$$

where $r(x)$ is real, $|1\rangle = |\psi\rangle$, $|2\rangle = |\psi\rangle^\perp$ ($|\psi\rangle^\perp$ being a unit vector in $\mathbb C^2$ orthogonal to $|\psi\rangle$) and
$|a\rangle = 2|\psi\rangle_{/\theta}$. It can be seen that geometrically this means that $|\xi(x)\rangle$ corresponds to a point on
$S^2$ in the plane spanned by $\vec u(\theta)$ and $\vec u_{/\theta}(\theta)$.

If $\eta \ne \pi/2$, this is for each value of $\theta$ a different plane, and all these planes intersect in the
origin only. Thus no single measurement $M$ can satisfy $I(\theta) = i(\theta;M)$ for all $\theta$. On the other
hand, if $\eta = \pi/2$, so that the states $\rho(\theta)$ lie on a great circle in the Poincaré sphere, then the
planes defined for each $\theta$ are all the same. In this case any measurement $M$ with all components
proportional to projector matrices for directions in the plane $\eta = \pi/2$ satisfies $I(\theta) = i(\theta;M)$ for
all $\theta \in \Theta$. In particular, any simple measurement in that plane has this property.

More generally, a smooth one-parameter model of a spin-half pure state with everywhere
positive quantum information admits a uniformly attaining measurement, i.e. such that $I(\theta) = i(\theta;M)$
for all $\theta \in \Theta$, if and only if the model is a great circle on the Poincaré sphere. This is
actually a quantum exponential transformation model, as discussed in an earlier example. □
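
The dichotomy in Example 12 can be illustrated numerically. The sketch below assumes a simple spin measurement along a unit vector $\vec v$, with outcomes $\pm1$ of probability $\tfrac12(1 \pm \vec u\cdot\vec v)$; the function names and the chosen angles are illustrative and not taken from the paper.

```python
import math

def bloch(eta, theta):
    """Point on the Poincare sphere with colatitude eta and longitude theta."""
    return (math.sin(eta) * math.cos(theta),
            math.sin(eta) * math.sin(theta),
            math.cos(eta))

def bloch_dtheta(eta, theta):
    """Derivative of bloch(eta, theta) with respect to theta."""
    return (-math.sin(eta) * math.sin(theta),
            math.sin(eta) * math.cos(theta),
            0.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def quantum_information(eta):
    """I(theta) = tr{rho (u_theta . sigma)^2} = |u_theta|^2 = sin^2(eta)."""
    du = bloch_dtheta(eta, 0.0)
    return dot(du, du)

def classical_information(eta, theta, v):
    """Fisher information about theta for a +-1 spin measurement along v.

    With c = u . v the outcome probabilities are (1 +- c)/2, so the
    Fisher information is (dc/dtheta)^2 / (1 - c^2).
    """
    c = dot(bloch(eta, theta), v)
    dc = dot(bloch_dtheta(eta, theta), v)
    return dc * dc / (1.0 - c * c)

# Measurement direction in the equatorial plane (longitude 0.3 chosen arbitrarily).
v = bloch(math.pi / 2, 0.3)

# Great circle (eta = pi/2): the bound is attained for every theta tried.
great_circle = [classical_information(math.pi / 2, t, v) for t in (0.7, 1.9, 2.5)]

# Small circle (eta = pi/3): the same measurement falls short at theta = 0.7,
# and attains the bound only where v is parallel to u_theta (theta = 0.3 + pi/2).
eta = math.pi / 3
short = classical_information(eta, 0.7, v)
attained = classical_information(eta, 0.3 + math.pi / 2, v)
bound = quantum_information(eta)
```

On the great circle every entry of `great_circle` agrees with `quantum_information(math.pi / 2)`, whereas on the small circle a fixed direction attains `bound` only at isolated values of $\theta$, in line with the geometric argument above.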

When the state $\rho$ is strictly positive, and under further nondegeneracy conditions, essentially
the only way to achieve the bound (31) is through measuring the quantum score. In the discussion
below we first keep the value of $\theta$ fixed. Since any nonnegative self-adjoint matrix can be written
as a sum of rank-one matrices (using its eigenvalue-eigenvector decomposition), it follows that any
dominated measurement can be refined to a measurement for which each $m(x)$ is of rank 1, thus
$m(x) = r(x)|\xi(x)\rangle\langle\xi(x)|$ for some real $r(x)$ and state-vector $|\xi(x)\rangle$ (as discussed earlier). If
one measurement is the refinement of another, then the distributions of the outcomes are related
in the same way. Therefore, under refinement of a measurement, Fisher expected information
cannot decrease. Therefore if any measurement achieves (31), there is also a measurement with
rank 1 components achieving the bound. Consider such a measurement. Suppose that $\rho > 0$ and
that all the eigenvalues of $\rho_{//\theta}$ are different. The condition $m(x)^{1/2}\rho_{//\theta}\rho^{1/2} = r(x)\,m(x)^{1/2}\rho^{1/2}$ is
then equivalent to $|\xi(x)\rangle\langle\xi(x)|\rho_{//\theta} = r(x)|\xi(x)\rangle\langle\xi(x)|$, which states that $\xi(x)$ is an eigenvector of
$\rho_{//\theta}$. Since we must have $\int m(x)\,\nu(dx) = \mathbf 1$, it follows that all eigenvectors of $\rho_{//\theta}$ occur in this way
in components $m(x)$ of $M$. The measurement can therefore be reduced or coarsened (the opposite
of refined) to a simple measurement of the quantum score, and the reduction (at the level of the
outcome) is sufficient.

Suppose now the state $\rho(\theta)$ is strictly positive for all $\theta$, and that the quantum score has
distinct eigenvalues for at least one value of $\theta$. Suppose a single measurement exists attaining (31)
uniformly in $\theta$. Any refinement of this measurement therefore also achieves the bound uniformly, in
particular, the refinement to components which are all proportional to projectors onto orthogonal
one-dimensional eigenspaces of the quantum score at the value of $\theta$ where the eigenvalues are
distinct. Therefore the eigenvectors of the quantum score at this value of $\theta$ are eigenvectors at
all other values of $\theta$. Therefore there is a self-adjoint operator $X$ with distinct eigenvalues such
that $\rho_{//\theta}(\theta) = f(X;\theta)$ for each $\theta$. Fix $\theta_0$ and let $F(X;\theta) = \int_{\theta_0}^{\theta} f(X;\theta')\,d\theta'$. Let $\rho_0 = \rho(\theta_0)$. If we
consider the defining equation (10) as a differential equation for $\rho(\theta)$ given the quantum score, and
with initial condition $\rho(\theta_0) = \rho_0$, we see that a solution is $\rho(\theta) = \exp\{\tfrac12 F(X;\theta)\}\,\rho_0\,\exp\{\tfrac12 F(X;\theta)\}$.
Under smoothness conditions the solution is unique. Rewriting the form of this solution, we come
to the following theorem:

Theorem 2 (Uniform attainability of quantum information bound). Suppose that the
state is everywhere positive, the quantum score has distinct eigenvalues for some value of $\theta$, and
is smooth. Suppose that a measurement $M$ exists with $i(\theta;M) = I(\theta)$ for all $\theta$, thus attaining the
Braunstein-Caves information bound (31) uniformly in $\theta$. Then there is an observable $X$ such
that a simple measurement of $X$ also achieves the bound uniformly, and the model is of the form

$$\rho(\theta) = c(\theta)\exp\{\tfrac12 F(X;\theta)\}\,\rho_0\,\exp\{\tfrac12 F(X;\theta)\} \qquad (33)$$

for a function $F$, indexed by $\theta$, of an observable $X$ where $c(\theta) = 1/\mathrm{tr}\{\rho_0\exp(F(X;\theta))\}$, $\rho_{//\theta}(\theta) = f(X;\theta) - \mathrm{tr}\{\rho(\theta)f(X;\theta)\}$,
and $f(X;\theta) = F_{/\theta}(X;\theta)$. Conversely, for a model of this form, a
measurement of $X$ achieves the bound uniformly.
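
The expression for the quantum score in Theorem 2 can be checked directly from (33). The following sketch assumes only that $F(X;\theta)$ and $f(X;\theta) = F_{/\theta}(X;\theta)$ are functions of the same observable $X$ and therefore commute, and uses the defining relation $\rho_{/\theta} = \tfrac12(\rho\,\rho_{//\theta} + \rho_{//\theta}\,\rho)$.

```latex
\begin{align*}
\rho_{/\theta}
 &= \frac{c_{/\theta}}{c}\,\rho + \tfrac12 f\rho + \tfrac12\rho f
  = \tfrac12\Bigl\{\bigl(f + \tfrac{c_{/\theta}}{c}\bigr)\rho
      + \rho\bigl(f + \tfrac{c_{/\theta}}{c}\bigr)\Bigr\}, \\
\frac{c_{/\theta}}{c}
 &= -\frac{\mathrm{tr}\{\rho_0\, f\, e^{F}\}}{\mathrm{tr}\{\rho_0\, e^{F}\}}
  = -\,\mathrm{tr}\bigl\{c\, e^{F/2}\rho_0\, e^{F/2} f\bigr\}
  = -\,\mathrm{tr}\{\rho f\}, \\
\rho_{//\theta} &= f(X;\theta) - \mathrm{tr}\{\rho(\theta) f(X;\theta)\}.
\end{align*}
```

The normalising factor $c(\theta)$ thus contributes exactly the centring term that gives the score mean zero.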

Remark 2 (Spin-half case). For spin-half, if the information is positive then the quantum score 
has distinct eigenvalues, since the outcome of a measurement of the quantum score always equals 
one of the eigenvalues, has mean zero, and positive variance. □ 

Theorem 3 (Uniform attainability of quantum Cramér-Rao bound). Suppose the positivity
and nondegeneracy conditions of the previous theorem are satisfied, and suppose that for
the outcome of some measurement $M$ a statistic $t$ exists which is for all $\theta$ an unbiased estimator
of $\theta$ achieving Helstrom's quantum Cramér-Rao bound (19), $\mathrm{Var}(t) = I(\theta)^{-1}$. Then the model is
actually a quantum exponential model of symmetric type

$$\rho(\theta) = c(\theta)\exp\{\tfrac12\theta T\}\,\rho_0\,\exp\{\tfrac12\theta T\} \qquad (34)$$

for some observable $T$, and simple measurement of $T$ is equivalent to the coarsening of $M$ according
to $t$.

Proof. The coarsening $M' = M \circ t^{-1}$ of the measurement $M$ corresponding to $t$ also achieves the
quantum information bound (31) uniformly, $i(\theta;M') = I(\theta)$. Apply Theorem 2 to this measurement
and we discover that the model is of the form (33), while (if necessary refining the
measurement to have rank one components) $t$ can be considered as a function of the outcome of
a measurement of the observable $X$, and it achieves the classical Cramér-Rao bound for unbiased
estimators of $\theta$ based on this outcome. Now the density of the outcome (with respect to counting
measure on the eigenvalues of $X$) is found to be $c(\theta)\exp(F(x;\theta))\,\mathrm{tr}\{\rho_0\Pi_{[X=x]}\}$. Hence, up to
addition of functions of $\theta$ or $x$ alone, $F(x;\theta)$ is of the form $\theta\,t(x)$. □

Example 12 concerned pure spin-half models given by circles of constant latitude on the
Poincaré sphere. Taking the product of $n$ identical copies of such a model produces a spin-$j$
model, with $j = n/2$, parameterised by a circle. It follows from the discussion at the end of
Example 12, (29) and the additivity of Fisher information that if such a spin-$j$ model is given by
a great circle then there is a measurement $M$ such that equality holds in (31).

The basic inequality (31) holds also when the dimension of $\Theta$ is greater than one. In that case,
the quantum information matrix $I(\theta)$ is defined in (30) and the Fisher information matrix $i(\theta;M)$
is defined by

$$i_{rs}(\theta;M) = E_\theta\{l_r(\theta)\,l_s(\theta)\},$$

where $l_r$ denotes $l_{/\theta_r}$ etc. Then (31) holds in the sense that $I(\theta) - i(\theta;M)$ is positive semi-definite.
The inequality is sharp in the sense that $I(\theta)$ is the smallest matrix dominating all $i(\theta;M)$.
However it is typically not attainable, let alone uniformly attainable.

Theorem 2 can be generalised to the case of a vector parameter. This also leads to a generalisation
of Theorem 3 which is the content of Corollary 1 below. First we give a lemma.

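Lemma 1. Let $\rho : \Theta \to \mathcal S(\mathbb C^d)$ be a twice differentiable parametric quantum model. Then

$$(\rho_{//j}\rho_{//i} - \rho_{//i}\rho_{//j})\rho + \rho(\rho_{//i}\rho_{//j} - \rho_{//j}\rho_{//i}) = 2(\rho_{//i/j} - \rho_{//j/i}) \circ \rho,$$

where $\rho_{//\theta} = (\rho_{//1}, \ldots, \rho_{//k})$ denotes the symmetric quantum score and $\circ$ denotes the Jordan product
(here $A \circ B = AB + BA$).

Proof. By definition of $\rho_{//\theta}$, we have

$$4\rho_{/i} = 2(\rho_{//i}\rho + \rho\rho_{//i}).$$

Differentiating this gives

$$4\rho_{/i/j} = 2(\rho_{//i/j}\rho + \rho_{//i}\rho_{/j} + \rho_{/j}\rho_{//i} + \rho\rho_{//i/j})$$
$$\quad\ \ = 2(\rho_{//i/j}\rho + \rho\rho_{//i/j}) + \rho_{//i}\rho\rho_{//j} + \rho_{//i}\rho_{//j}\rho + \rho\rho_{//j}\rho_{//i} + \rho_{//j}\rho\rho_{//i}.$$

Since $\rho_{/i/j} = \rho_{/j/i}$, this leads to

$$(\rho_{//j}\rho_{//i} - \rho_{//i}\rho_{//j})\rho + \rho(\rho_{//i}\rho_{//j} - \rho_{//j}\rho_{//i}) = 2\{(\rho_{//i/j} - \rho_{//j/i})\rho + \rho(\rho_{//i/j} - \rho_{//j/i})\}.$$

□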

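Theorem 4. Let $\rho : \Theta \to \mathcal S(\mathbb C^d)$ be a twice differentiable parametric quantum model. If
(i) there is a measurement $M$ with $i(\theta;M) = I(\theta)$ for all $\theta$,
(ii) $\rho(\theta) > 0$ for all $\theta$,
(iii) $\Theta$ is simply connected
then, for any $\theta_0$ in $\Theta$, there are an observable $X$ and a function $F$ (possibly depending on $\theta_0$) such
that

$$\rho(\theta) = \exp\{\tfrac12 F(X;\theta)\}\,\rho(\theta_0)\,\exp\{\tfrac12 F(X;\theta)\}.$$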

Proof. Since $i(\theta;M) = I(\theta)$, it follows from equation (32) and (ii) that there are real-valued
functions $r_1, \ldots, r_{\dim\Theta}$ on $\mathcal X \times \Theta$ such that

$$m(x)\rho_{//i}(\theta) = r_i(x,\theta)\,m(x),$$

for all $\theta$ in $\Theta$ and $\nu$-almost all $x$. Then

$$m(x)\rho_{//i}(\theta)\rho_{//j}(\theta') = r_i(x,\theta)\,r_j(x,\theta')\,m(x) = m(x)\rho_{//j}(\theta')\rho_{//i}(\theta),$$

for all $\theta$, $\theta'$ in $\Theta$ and $1 \le i, j \le \dim\Theta$. Integration over $\mathcal X$ shows that $\rho_{//i}(\theta)$ and $\rho_{//j}(\theta')$ commute.
By von Neumann's Theorem, there is an operator $X$ and real-valued functions $f_1, \ldots, f_{\dim\Theta}$ on
$\mathbb R \times \Theta$ such that

$$\rho_{//i}(\theta) = f_i(X;\theta). \qquad (35)$$

Using condition (ii), and the fact that $\rho_{//i}$ and $\rho_{//j}$ commute, it follows from the Lemma that
$\rho_{//i/j} = \rho_{//j/i}$. By condition (iii), (35) can be integrated to give a function $F$ such that

$$\rho_{//i}(\theta) = F_{/i}(X;\theta).$$

The result follows by uniqueness of solutions of differential equations. □

Corollary 1. If under the conditions of Theorem 4 there exists an unbiased estimator $t$ of $\theta$
based on the measurement $M$ achieving (19), then the model is a quantum exponential family of
symmetric type with commuting $T_r$.

Versions of these results have been known for some time; see Young (1975), Fujiwara and Nagaoka
(1995), Amari and Nagaoka (2000); compare especially our Corollary 1 to Amari and Nagaoka
(2000, Theorem 7.6), and our Theorem 4 to parts (I) to (IV) of the subsequent outlined proof in
Amari and Nagaoka (2000). Unfortunately the precise regularity conditions and detailed proofs
seem to be available only in some earlier publications in Japanese. Note that we have obtained the
same conclusions, by a different proof, in the spin-half pure state case, Example 12. This indicates
that a more general result is possible without the hypothesis of positivity of the state.

The symmetric logarithmic derivative is not the unique quantum analogue of the classical
statistical concept of score. Other analogues include the right, left and balanced derivatives
obtained by suitable variants of (10). Each of these gives a quantum information inequality and
a quantum Cramér-Rao bound analogous to (31) and (19). See Belavkin (1976). There is no
general relationship between the various quantum information inequalities when the dimension of
$\Theta$ is greater than one.

In the next subsection we discuss the issue of asymptotic attainability of these and similar 
bounds. 

6.3 Asymptotic Information Bounds 

In classical statistics, the Cramér-Rao bound is attainable uniformly in the unknown parameter
only under rather special circumstances. On the other hand, the restriction to unbiased estimators
is hardly made in practice and indeed is difficult to defend. However, we have a richly developed
asymptotic theory which states that in large samples certain estimators (e.g., the maximum likelihood
estimator) are approximately unbiased and approximately normally distributed with variance
attaining the Cramér-Rao bound. Moreover, no estimator can do better, in various precise mathematical
senses (the Hájek-LeCam asymptotic local minimax theorem and convolution theorem, for
instance). Recent work by Gill and Massar (2000), surveyed in Gill (2001a), makes a first attempt
to carry over these ideas to quantum statistics. Similar results have been obtained, interestingly,
with quite different methods, in a series of papers by Young (1975), Fujiwara and Nagaoka (1995),
Hayashi (1997), and Hayashi (1998). Another very recent approach, using large deviation theory
rather than central limit theory, is given by Keyl and Werner (2001). The aim of Gill and Massar
(2000) was to answer a question first posed by Peres and Wootters (1991): do joint measurements
on a product of identical quantum systems contain more information about the common
state of the subsystems than separate measurements? The question was first answered, in the
affirmative, in a rather specific form by Massar and Popescu (1995): they considered for the most
part just $n = 2$ copies of a spin-half pure state, in a Bayesian setting with a special loss function
and prior distribution. Work of Barndorff-Nielsen and Gill (2000) showed that this advantage of
joint over separate measurements disappears, for the spin-half pure state example, as $n \to \infty$.

The approach of Gill and Massar (2000) is firstly to delineate more precisely the class of attainable
information matrices $i_n(\theta;M)$ based on arbitrary (or special classes of) measurements on
the model (28) of $n$ identical particles each in the same state $\rho(\theta)$. Next, using the van Trees
inequality, a Bayesian version of the Cramér-Rao inequality, see Gill and Levit (1995), bounds
on $i_n(\theta;M)$ are converted into bounds on the asymptotic scaled mean quadratic error matrix of
regular estimators of $\theta$. Thirdly, one constructs measurements and estimators which achieve those
bounds asymptotically. The first step yields the following theorem.

Theorem 5 (Gill-Massar information bound). In the model (28), one has

$$\mathrm{tr}\{I(\theta)^{-1}\,i_n(\theta;M)/n\} \le \dim(\mathcal H) - 1 \qquad (36)$$

in any of the following cases: (i) $\dim(\Theta) = 1$ and $\dim(\mathcal H) = 2$, (ii) $\rho$ is a pure state, (iii) the
measurement $M$ is separable.

Case (i) follows from the earlier information inequality (31) from which follows, without any
further conditions, $\mathrm{tr}\{I(\theta)^{-1}\,i_n(\theta;M)/n\} \le \dim(\Theta)$. The class of separable measurements, see
Section 2.2, includes all multilocal instruments, i.e., instruments which are composed of a sequence of
instruments acting on separate particles. Thus it is allowed that the measurement
made on particle 2 depends on the outcome of the measurement on particle 1, and even that after
these two measurements, yet another measurement, depending on the results so far, is made on
the first particle in its new state, etc.

In the spin-half case the bound (36) is achievable in the sense that for any matrix $K$ such that
$\mathrm{tr}\{I(\theta)^{-1}K\} \le 1$, there exists a measurement $M$ on one particle, generally depending on $\theta$, such
that $i(\theta;M) = K$. The measurement is a randomised choice of several simple measurements of
spin, one spin direction for each component of $\theta$.
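
A small numerical illustration of the bound (36) for one spin-half particle may be helpful. The sketch below assumes the two-parameter pure-state model with Bloch vector $\vec u(\eta,\theta)$, for which the quantum information matrix is $\mathrm{diag}(1, \sin^2\eta)$, and a simple spin measurement along a unit vector $\vec v$ with outcome probabilities $\tfrac12(1\pm\vec u\cdot\vec v)$. The function names and test directions are illustrative choices, not taken from Gill and Massar (2000).

```python
import math

def bloch(eta, theta):
    """Bloch vector with colatitude eta and longitude theta."""
    return (math.sin(eta) * math.cos(theta),
            math.sin(eta) * math.sin(theta),
            math.cos(eta))

def bloch_deta(eta, theta):
    """Derivative of the Bloch vector with respect to eta (a unit vector)."""
    return (math.cos(eta) * math.cos(theta),
            math.cos(eta) * math.sin(theta),
            -math.sin(eta))

def bloch_dtheta(eta, theta):
    """Derivative of the Bloch vector with respect to theta (length sin eta)."""
    return (-math.sin(eta) * math.sin(theta),
            math.sin(eta) * math.cos(theta),
            0.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def information_trace(eta, theta, v):
    """tr{I^{-1} i} for a simple spin measurement along v.

    With c = u . v, the classical Fisher matrix is grad(c) grad(c)^T / (1 - c^2)
    and the quantum information matrix is diag(1, sin^2 eta).
    """
    c = dot(bloch(eta, theta), v)
    c_eta = dot(bloch_deta(eta, theta), v)
    c_theta = dot(bloch_dtheta(eta, theta), v)
    weighted = c_eta ** 2 + c_theta ** 2 / math.sin(eta) ** 2
    return weighted / (1.0 - c * c)

# Arbitrary state and measurement directions chosen for illustration.
eta0, theta0 = 1.0, 0.4
directions = [bloch(2.0, 1.3), bloch(0.5, -2.0), bloch(1.7, 3.0)]
traces = [information_trace(eta0, theta0, v) for v in directions]
```

Each entry of `traces` equals $\dim(\mathcal H) - 1 = 1$ up to rounding: $\vec u$, $\vec u_{/\eta}$ and $\vec u_{/\theta}/\sin\eta$ form an orthonormal basis, so the squared components of $\vec v$ along them sum to one. Since Fisher information is linear under randomised choice of measurement, any mixture of such spin measurements also sits on the boundary of (36), which is consistent with the achievability statement above.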

Application of the van Trees inequality gives the following asymptotic bound: 

Theorem 6 (Asymptotic information bound). In the model (28) let $V(\theta)$ denote the limiting
scaled mean quadratic error matrix of a regular sequence of estimators $\hat\theta^{(n)}$ based on a sequence of
measurements $M_n$ on $n$ particles; i.e., $V^{ij}(\theta) = \lim_{n\to\infty} n\,E_\theta\{(\hat\theta^{(n)i} - \theta^i)(\hat\theta^{(n)j} - \theta^j)\}$. Then $V$ satisfies
the inequality

$$\mathrm{tr}\{I(\theta)^{-1}V(\theta)^{-1}\} \le \dim(\mathcal H) - 1 \qquad (37)$$

in any of the following cases: (i) $\dim(\Theta) = 1$ and $\dim(\mathcal H) = 2$, (ii) $\rho$ is a pure state, (iii) the
measurements $M_n$ are separable.

A regular estimator sequence is one for which the mean quadratic error matrices converge
uniformly in $\theta$ to a continuous limit. It is also possible to give a version of the theorem in terms
of convergence in distribution, Hájek-regularity and $V$ the mean quadratic error matrix of the
limiting distribution, rather than the limit of the mean quadratic error.

In the spin-half case, this bound is also asymptotically achievable, in the sense that for any
continuous matrix function $W(\theta)$ such that $\mathrm{tr}\{I(\theta)^{-1}W(\theta)^{-1}\} \le 1$ there exists a sequence of
separable measurements $M_n$ with asymptotic scaled mean quadratic error matrix equal to $W$. This
result is proved by consideration of a rather natural two-stage measurement procedure. Firstly, on
a small (asymptotically vanishing) proportion of the particles, carry out arbitrary measurements
allowing consistent estimation of $\theta$, resulting in a preliminary estimate $\tilde\theta$. Then on each of the
remaining particles, carry out the measurement $\tilde M$ (on each separate particle) which is optimal in
the sense that $i(\tilde\theta;\tilde M) = K = W(\tilde\theta)^{-1}$. Estimate $\theta$ by maximum likelihood estimation, conditional
on the value of $\tilde\theta$, on the outcomes obtained in the second stage. For large $n$, since $\tilde\theta$ will then be
close to the true value of $\theta$, the measurement $\tilde M$ will have Fisher information $i(\theta;\tilde M)$ close to that
of the 'optimal' measurement on one particle with Fisher information $i(\theta;M) = W(\theta)^{-1}$. By the
usual properties of maximum likelihood estimators, it will therefore have scaled mean quadratic
error close to $W(\theta)$. These measurements are not just separable, but multilocal, and within that
class, adaptive and sequential with each new subsystem being measured only once.

In the spin-half case we have therefore a complete asymptotic efficiency theory in any of the 
three cases (i) a one-dimensional parameter, (ii) a pure state, (iii) separable measurements. By 
'complete' we mean that it is precisely known what is the set of all attainable limiting scaled mean 
quadratic error matrices. This collection is described in terms of the quantum information matrix 
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for one particle. What is interesting is that when none of these thr ee conditions hold, greater 



asymptotic precision is possible. For instance, iGill and Massaii l|2000|) exhibit a measurement of 
two spin- half particles which, for a completely unknown mixed state (a three-parameter model), 
has about 50% larger total Fisher information (for certain parameter values) than any separable 
measurement on two particles. Therefore if one has a large number n of particles, one has about 
25% better precision when using the maximum likelihood estimator applied to the outcomes of 
this measurement on n/2 pairs of particles, than any separable measurement whatsoever on all n. 
It is not known whether taking triples, quadruples, etc., allows even greater increases of precision. 
It would be valuable to delineate precisely the set of all attainable Fisher information matrices when
non-separable measurements are allowed on each number of particles.

A similar instance of this phenomenon was called non-locality without entanglement by Bennett et al.
(1999a). One could say that though the $n$ particles are not in an entangled state, one needs an
'entangled measurement', presumably brought about by bringing the particles into interaction with
one another (unitary evolution starting from the product state) before measurement, in order to
extract maximal information about their state. The word 'non-locality' refers to the possibility
that the $n$ particles could be widely separated and brought into interaction through other
entangled particles; see Section 8 for further examples of this kind in the context of optimal information
transmission and in teleportation.

7 Infinite Dimensional Space 

So far our examples have concerned spin-half systems, for which the dimension of the Hilbert 
space $\mathcal{H}$ is 2, and occasionally spin-$j$ systems (dimension $2j + 1$). In this section we give a survey
of an important infinite dimensional example. The finite dimensional cases led us to parametric 
quantum statistical models. If the system has an infinite-dimensional Hilbert space, non- and semi- 
parametric quantum statistical models make an entrance. So far, they have been little studied 
from the point of view of modern mathematical statistics, despite their significance in experimental 
quantum physics, especially quantum optics. 

7.1 Harmonic Oscillator 

In this subsection we summarise some useful basic theory, and in the next we consider a basic 
statistical problem. 

The simple harmonic oscillator is the basic model for the motion of a quantum particle in a
quadratic potential well on the real line. Precisely the same mathematical structure describes
oscillations of a single mode of an electromagnetic field (a single frequency in one direction in
space). A useful orthonormal basis in the latter situation is given by the state-vectors of the pure
states representing zero, one, two, ... photons. We denote these state-vectors by $|0\rangle, |1\rangle, |2\rangle, \ldots$.
This basis is called the number basis. For the simple harmonic oscillator, the pure state with
state-vector $|m\rangle$ is a state of definite energy $1/2 + m$ units, $m = 0, 1, 2, \ldots$. A pure state with
state-vector $|\psi\rangle = \sum_m c_m |m\rangle$, where $\sum_m |c_m|^2 = 1$, is a complex superposition of these states. A
mixed state $\rho$ is a probability mixture over pure states $|\psi\rangle\langle\psi|$ with state-vectors $|\psi\rangle$.

Some key operators in this context, together with their common names, are

$$\begin{array}{ll}
A^+|n\rangle = \sqrt{n+1}\,|n+1\rangle & \text{Creation} \\
A^-|n\rangle = \sqrt{n}\,|n-1\rangle & \text{Annihilation} \\
N|n\rangle = n\,|n\rangle & \text{Number} \\
Q = (A^- + A^+)/\sqrt{2} & \text{Position} \\
P = \frac{1}{i}(A^- - A^+)/\sqrt{2} & \text{Momentum} \\
X_\phi = (\cos\phi)\,Q + (\sin\phi)\,P & \text{Quadrature at phase } \phi
\end{array} \qquad (38)$$

One should observe that

$$[Q, P] = i\mathbf{1}. \qquad (39)$$
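
For experimenting with these operators it can help to write them as matrices in the number basis. The sketch below truncates the basis at a finite dimension `d` (an assumption made purely for computation, since the true operators are infinite dimensional); the variable names are ours. Because of the truncation, the commutation relation (39) holds only away from the last basis vector.

```python
import numpy as np

d = 12                                   # truncation: keep |0>, ..., |d-1> (assumption)
n = np.arange(d)

A_minus = np.diag(np.sqrt(n[1:]), k=1)   # annihilation: A-|n> = sqrt(n) |n-1>
A_plus = A_minus.conj().T                # creation:     A+|n> = sqrt(n+1) |n+1>
N = np.diag(n).astype(complex)           # number:       N|n>  = n |n>
Q = (A_minus + A_plus) / np.sqrt(2)      # position
P = (A_minus - A_plus) / (1j * np.sqrt(2))  # momentum

def quadrature(phi):
    """Quadrature at phase phi: X_phi = cos(phi) Q + sin(phi) P."""
    return np.cos(phi) * Q + np.sin(phi) * P

# [Q, P] equals i times the identity, except in the last row and column,
# where the truncation of the basis shows up.
comm = Q @ P - P @ Q
```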

In the simple quantum harmonic oscillator, the state of a particle evolves under the Hamiltonian
$H = \frac12(Q^2 + P^2) = N + \frac12\mathbf{1}$; thus the state-vector of a pure state satisfies $|\psi(t)\rangle = e^{-itH}|\psi(0)\rangle$,
and an arbitrary state evolves as $\rho(t) = e^{-itH}\rho(0)e^{itH}$. The operators $Q$ and $P$ correspond to
the position (on the real line) and the momentum of the particle. Indeed, the spectral
decompositions of these two operators yield the PProM's of measurements of position and momentum
respectively. It turns out that for a complex number $z = re^{i\phi}$ and the corresponding operator
(called the Weyl operator) $W_z = \exp(irX_\phi)$, we have $e^{i\theta N} W_z e^{-i\theta N} = W_{e^{i\theta}z}$, or in terms of the
operator $X_\phi$, $e^{i\theta N} e^{itX_\phi} e^{-i\theta N} = e^{itX_{\phi+\theta}}$. These relations become especially powerful when we note
a short cut to the computation of the probability distribution of the measurement of the PProM
corresponding to an observable $X$: it is the probability distribution with characteristic function
$\operatorname{tr}\{\rho e^{itX}\}$. Combining these facts, we see that the distribution of the outcome of a measurement
of position $Q$ on the particle at time $t$ is the same as that of $X_t$ at time 0. In particular, with
$t = \pi/2$, measuring $P$ at time 0 has the same distribution as measuring $Q$ at time $\pi/2$. For future
reference, define $F = e^{-i(\pi/2)N}$ and note the relation $FP = QF$.
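
The rotation relation between $N$ and the quadratures can be checked numerically at the level of the generators, that is, $e^{i\theta N} X_\phi e^{-i\theta N} = X_{\phi+\theta}$. The sketch below again uses a truncated number basis (our assumption); since $N$ is diagonal there, the identity survives truncation exactly. The helper names are hypothetical.

```python
import numpy as np

d = 10                                    # truncation dimension (assumption)
n = np.arange(d)
A_minus = np.diag(np.sqrt(n[1:]), k=1)
A_plus = A_minus.T
Q = (A_minus + A_plus) / np.sqrt(2)
P = (A_minus - A_plus) / (1j * np.sqrt(2))

def rot(theta):
    """e^{i theta N}, which is diagonal in the number basis."""
    return np.diag(np.exp(1j * theta * n))

def X(phi):
    """Quadrature at phase phi."""
    return np.cos(phi) * Q + np.sin(phi) * P

theta, phi = 0.7, 0.4
lhs = rot(theta) @ X(phi) @ rot(-theta)   # should equal X(phi + theta)
F = rot(-np.pi / 2)                       # F = e^{-i(pi/2)N} = (-i)^N
```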

We mention for later reference that the Weyl operators form a projective unitary representation
of the translation group on the real plane, since these are unitary operators with $W_z W_{z'} =
w(z, z') W_{z+z'}$ for a certain complex function $w$ of modulus 1, cf. (23).

In order to derive the probability distributions of outcomes of measurements of the observables
defined above, it is useful to consider a particular concrete representation of the abstract Hilbert
space $\mathcal{H}$ as $L^2_{\mathbb{C}}(\mathbb{R})$, that is, the space of complex-valued, Borel measurable, square integrable
functions on the real line. The basis vectors $|n\rangle$ will be represented by normalised Hermite polynomials
times the square root of the normal density with mean zero and variance half. The observables $Q$
and $P$ become rather easy to describe in this representation. At the same time, algebraic results
from the theory of representations of groups provide further relations between the observables $X_\phi$,
$N$, $Q$ and $P$.

Let us define the Hermite polynomials $H_n(x)$, $n = 0, 1, 2, \ldots$, by

$$H_n(x) = e^{x^2}(-1)^n \frac{d^n}{dx^n} e^{-x^2}. \qquad (40)$$

It follows that $H_n(x)$ is an $n$'th order polynomial with leading term $(2x)^n$. These polynomials
can also be defined starting from the simple polynomials $(2x)^n$, $n = 0, 1, 2, \ldots$, by Gram-Schmidt
orthogonalisation with respect to the normal density with mean 0 and variance 1/2, $n(x) =
(1/\sqrt{\pi})\exp(-x^2)$. Now if $X$ is normal with mean zero and variance half, then $\mathrm{E}(H_n(X)^2) = 2^n n!$.
Normalising, we obtain the following orthonormal sequence $u_n$ in the space $L^2_{\mathbb{C}}(\mathbb{R})$:

$$u_n(x) = \frac{1}{\sqrt{2^n n!}}\, H_n(x)\sqrt{n(x)}. \qquad (41)$$

This sequence is not only orthonormal but complete: it forms a basis of $L^2_{\mathbb{C}}(\mathbb{R})$. The functions $u_n$
satisfy the following recursion relations

$$\sqrt{2}\, x\, u_n(x) = \sqrt{n+1}\; u_{n+1}(x) + \sqrt{n}\; u_{n-1}(x)$$

$$\frac{d}{dx} u_n(x) = \sqrt{2}\sqrt{n}\; u_{n-1}(x) - x\, u_n(x).$$
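
The functions $u_n$ and their recursion relations are easy to check numerically. The sketch below evaluates $u_n$ with NumPy's physicists' Hermite polynomials and tests orthonormality and both recursions on a grid; the grid limits, the Riemann-sum quadrature and the function name `u` are our own choices for illustration.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermval   # physicists' Hermite polynomials H_n

def u(n, x):
    """u_n(x) = H_n(x) sqrt(n(x)) / sqrt(2^n n!), where n(x) = exp(-x^2) / sqrt(pi)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0                              # selects H_n in the Hermite series
    norm = sqrt(2.0**n * factorial(n))
    return hermval(x, coeffs) * np.exp(-x**2 / 2) / (pi**0.25 * norm)

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# Orthonormality, by a simple Riemann sum (accurate here because u_n decays fast).
norm_22 = np.sum(u(2, x) * u(2, x)) * dx
inner_24 = np.sum(u(2, x) * u(4, x)) * dx

# The two recursion relations, for n = 3.
lhs_mult = sqrt(2) * x * u(3, x)
rhs_mult = sqrt(4) * u(4, x) + sqrt(3) * u(2, x)
lhs_deriv = np.gradient(u(3, x), x)              # finite-difference derivative
rhs_deriv = sqrt(2) * sqrt(3) * u(2, x) - x * u(3, x)
```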

This shows us that under the equivalence defined by $|n\rangle \longleftrightarrow u_n$, one has the following
correspondences

$$\begin{array}{rcl}
Q = (A^- + A^+)/\sqrt{2} & \longleftrightarrow & x \\
P = \frac{1}{i}(A^- - A^+)/\sqrt{2} & \longleftrightarrow & \frac{1}{i}\frac{d}{dx} \\
2N + \mathbf{1} = Q^2 + P^2 & \longleftrightarrow & x^2 - \frac{d^2}{dx^2}
\end{array} \qquad (42)$$

where, on the first line, by '$x$' we mean the operator of multiplication of a function of $x$ by $x$ to
obtain a new function. In this representation the operator $Q$ has 'diagonal' form, corresponding
to the PProM with element $\Pi(B)$, $B$ a Borel set of the real line, being the operator 'multiply by
the indicator function $1_B$'. Thus for a pure state with state-vector $|\psi\rangle$ in $\mathcal{H}$ represented by the
wave-function $x \mapsto \psi(x)$ in $L^2_{\mathbb{C}}(\mathbb{R})$, the probability that a measurement of $Q$ takes a value in $B$ is
equal to $\|\Pi(B)\psi\|^2 = \int_B |\psi(x)|^2\,dx$, so that the outcome of the measurement has probability density
$|\psi(x)|^2$. Moreover,

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-itx}\, u_n(x)\,dx = (-i)^n u_n(t). \qquad (43)$$

By expanding an arbitrary wave function $\psi$ as a series of coefficients times $u_n$, one sees from this
that the operator $F = e^{-i(\pi/2)N} = (-i)^N$ is nothing else than the Fourier transform, and its
adjoint $F^*$ is the inverse Fourier transform. The relation $FP = QF$ between $Q$ and $P$ involving
$F$ tells us that the probability distribution of a measurement of momentum $P$ on a particle in
the pure state with state-vector $|\psi\rangle$ has density equal to the squared absolute value of the
Fourier transform of the wave function $\psi(x)$. Measurement of $Q$ is further studied in an example
in Appendix A.

More generally, for the observable $X_\phi$ and considering mixed states instead of pure, from
$e^{i\phi N} e^{itQ} e^{-i\phi N} = e^{itX_\phi}$ one may derive the following expression for the probability density of a
measurement of $X_\phi$ on a system in state $\rho$:

$$p_\rho(x;\phi) = \sum_{m=0}^{\infty}\sum_{m'=0}^{\infty} \rho_{m,m'}\, e^{-i(m-m')\phi}\, u_m(x)\, u_{m'}(x) \qquad (44)$$

where $\rho_{m,m'} = \langle m|\rho|m'\rangle$. The sense in which this double infinite sum converges is rather delicate;
however, if only finitely many matrix elements $\rho_{m,m'}$ are non-zero, the formula makes sense as it
stands.
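
For a state with finitely many non-zero matrix elements the quadrature density can be evaluated directly. The sketch below assumes the density has the form $p_\rho(x;\phi) = \sum_{m,m'} \rho_{m,m'} e^{-i(m-m')\phi} u_m(x) u_{m'}(x)$, which is our reading of the relation $e^{i\phi N} e^{itQ} e^{-i\phi N} = e^{itX_\phi}$; the function names and the test state $(|0\rangle + i|1\rangle)/\sqrt{2}$ are illustrative choices.

```python
import numpy as np
from math import factorial, pi, sqrt
from numpy.polynomial.hermite import hermval

def u(n, x):
    """Normalised Hermite function u_n(x) in L^2(R)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return hermval(x, coeffs) * np.exp(-x**2 / 2) / (pi**0.25 * sqrt(2.0**n * factorial(n)))

def quadrature_density(rho, x, phi):
    """Density of a measurement of X_phi in state rho (finite matrix in the number basis)."""
    d = rho.shape[0]
    U = np.array([u(m, x) for m in range(d)])               # shape (d, len(x))
    m = np.arange(d)
    phase = np.exp(-1j * phi * (m[:, None] - m[None, :]))   # e^{-i(m - m')phi}
    dens = np.einsum('ab,ax,bx->x', rho * phase, U, U)
    return dens.real                                        # imaginary part vanishes for Hermitian rho

# Test state (|0> + i|1>)/sqrt(2): mean of Q is 0 and mean of P is 1/sqrt(2).
c = np.array([1.0, 1.0j]) / np.sqrt(2)
rho = np.outer(c, c.conj())
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
mean_at = {phi: np.sum(x * quadrature_density(rho, x, phi)) * dx
           for phi in (0.0, 0.6, np.pi / 2)}
```

The means in `mean_at` should agree with $\cos\phi\,\langle Q\rangle + \sin\phi\,\langle P\rangle = \sin\phi/\sqrt{2}$ for this state, which is a useful check on the sign convention of the phase factor.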

7.2 Quantum Tomography 

In this subsection we discuss a statistical problem, called for historical reasons quantum
tomography, concerning the observables introduced in the previous subsection. Some key references
are the book Leonhardt (1997) and the survey papers D'Ariano (1997a,b), though there has been
much further progress since then. In its simplest form, the problem of quantum tomography
is: given independent observations of measurements of the quadrature at phase $\phi$, $X_\phi$, with $\phi$
drawn repeatedly at random from the uniform distribution on $[0, 2\pi]$, reconstruct the state $\rho$. In
statistical terms, we wish to do nonparametric estimation of $\rho$ from $n$ independent and
identically distributed observations $(\phi_i, X_i)$, with $\phi_i$ as just described and $X_i$ from the density (44) with
$\phi = \phi_i$. In quantum optics, measuring a single mode of an electromagnetic field in what is called a
quantum homodyne experiment, this would be the appropriate model with perfect photodetectors.
In practice, independent Gaussian noise should be added.

Recalling that the probability density of a measurement of $X_\phi$ has $\operatorname{tr}\{\rho\, e^{it\cos\phi\, Q + it\sin\phi\, P}\}$ as
its characteristic function, we note that if $Q$ and $P$ were actually commuting operators (they are
not!) then the joint characteristic function of a measurement of the two simultaneously would
have been the function $\operatorname{tr}\{\rho\, e^{isQ + itP}\}$ of the two variables $(s, t)$.

Now the latter may not be the bivariate characteristic function of a joint probability density,
but it is the characteristic function of a certain function called the Wigner function. This function
$W_\rho(q,p)$ is known to characterise $\rho$. It is a real-valued function, integrating to 1 over the whole
plane, but generally taking negative as well as positive values. The relation between the
characteristic function of a measurement of $X_\phi$ and the characteristic function of the Wigner function
which we have just described, shows that the probability density of a measurement of $X_\phi$ can be
computed from the Wigner function by treating it as a joint probability density of two random
variables $Q$, $P$ and computing from this density the marginal density of the linear combination
$\cos\phi\, Q + \sin\phi\, P$. Now this computation is nothing else than a computation of the Radon
transform of $W_\rho(q,p)$: its projection onto the line $(\cos\phi)q + (\sin\phi)p = 0$. This transform is well known
from computer-aided tomography, when for instance the data from which an X-ray image must
be computed is the collection of one-dimensional images obtained by projecting onto all possible
directions. Thus from the collection of all densities $p_\rho(x;\phi)$ of measurements of $X_\phi$, one could
in principle compute the Wigner function $W_\rho(q,p)$ by inverse Radon transform, from which one
can compute other representations of $\rho$ by further appropriate transformations. In particular, a
double infinite integral over $(p, q)$ of the product of the Wigner function with an appropriate kernel
results in $\rho$ in the 'position' representation, i.e., as the kernel of an integral transform mapping
$L^2_{\mathbb{C}}(\mathbb{R})$ into $L^2_{\mathbb{C}}(\mathbb{R})$. Not all states can be so represented, but at least all can be approximated in this
way. A further double infinite integral over $(x, x')$ of another kernel results in $\rho$ in the 'number'
representation, i.e., the elements $\rho_{m,m'}$.

The basic idea of quantum tomography is to carry out this sequence of mathematical
transformations on an empirical version of the density $p_\rho(x;\phi)$ obtained by some combination of
smoothing and binning of the observations $(\phi_i, X_i)$. This theoretical possibility was discovered
by Vogel and Risken (1989), and first carried out experimentally by M.G. Raymer and colleagues
in path-breaking experiments in the early 1990's, see Smithey et al. (1993). Despite the
enthusiasm with which the initial results were received, the method has a large number of drawbacks.
To begin with, it depends on some choices of smoothing parameters and/or binning intervals, 
and later, during the succession of integral transforms, on truncations of infinite integrals among 
other numerical approximations. It has been discovered that these 'smoothings' tend to destroy 
precisely the interesting 'quantum' features of the functions being reconstructed. The final re- 
sult suffers from both bias and variance, neither of which can be evaluated easily. Inverting the 
Radon transform is an ill-posed inverse problem and the whole procedure needs massive numbers 
of observations before it works reasonably well. 

In the mid 1990's G.M. D'Ariano and his coworkers in Pavia discovered a fascinating
method to short-cut this approach, see D'Ariano (1997a,b). Using the fact that the Weyl
operators introduced above form an irreducible projective representation of the translation group
on $\mathbb{R}^2$, they derived an elegant 'tomographic formula' expressing the mean of any operator $A$ (not
necessarily self-adjoint), i.e., $\operatorname{tr}(\rho A)$, as the integral of a function (depending on the choice of $A$)
of $x$ and $\phi$, multiplied by $p_\rho(x;\phi)$, with respect to Lebesgue measure on $\mathbb{R} \times [0, 2\pi]$. In particular,
if we take the operator $A$ to be $|m'\rangle\langle m|$ for given $(m, m')$, we have hereby expressed $\rho_{m,m'}$ as the
mean value of a certain function, indexed by $(m, m')$, of the observations $(\phi_i, X_i)$, as long as the
phases are chosen uniformly at random.

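The key relation of their approach is the identity

$$A = \frac{1}{2\pi}\int_{\mathbb{C}} \operatorname{tr}\{A W_z^*\}\, W_z\, dz$$

which can be derived (and generalised) with the theory of group representations. From this follows

$$\operatorname{tr}\{\rho A\} = \frac{1}{2\pi}\int_{\mathbb{C}} \operatorname{tr}\{A W_z^*\}\, \operatorname{tr}\{\rho W_z\}\, dz.$$

The left hand side is the mean value of interest. The first 'trace' on the right hand side is a
known function of the operator of interest $A$ and the variable $z$. In the second 'trace' on the right
hand side, after expressing $z = re^{i\phi}$ in polar coordinates, we recognise the characteristic function
evaluated at the argument $r$ of the probability density of our observations $p_\rho(x;\phi)$. Writing the
characteristic function as the integral over $x$ of $e^{irx}$ times this density, transforming the integral
over $z$ into integrals over $r$ and $\phi$, and reordering the three resulting integrals, we can rewrite the
right hand side as

$$\frac{1}{2\pi}\int_0^{2\pi}\int_{\mathbb{R}} p_\rho(x;\phi) \left[\int_0^{\infty} \operatorname{tr}\{A e^{-irX_\phi}\}\, e^{irx}\, r\, dr\right] dx\, d\phi.$$
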

The innermost integral can sometimes be evaluated analytically, otherwise numerically; but in
either case we have succeeded in our aim of rewriting means of operators of interest as means of
known kernel functions of our observations. In the case $A = |m'\rangle\langle m|$, of interest for reconstructing
$\rho_{m,m'}$, the kernel turns out to be bounded and hence we obtain unbiased estimators of the $\rho_{m,m'}$
with variance equal to $1/n$ times some bounded quantities.

Still this approach has its drawbacks. The required kernel function, in the case of reconstructing 
the density in the number representation, is highly oscillatory and even though everything is 
bounded, still huge numbers of observations are needed to get informative estimates. Also, the 
unbiased estimators constructed in this way are not unique and one may wonder whether better 
choices of kernels can be found. However, the approach does open a window of opportunity for 
further mathematical study of the mapping from $p_\rho(x;\phi)$ to $\rho_{m,m'}$ which could be a vital tool for
developing the most recent approach, which we now outline briefly. 

As we made clear, the statistical estimation problem seems related to the problems of non- 
parametric curve estimation, or more precisely, estimation of a parameter lying in an infinite 
dimensional space. Modern experience with such problems has developed an arsenal of methods, 
of which penalised and sieved likelihood, and nonparametric Bayesian methods, hold much promise 
as 'universal' approaches leading to optimal methods. In the present context, sieved maximum 
likelihood is very natural, since truncation of the Hilbert space in the number basis leads to finite 
dimensional parametric models which can in principle be tackled by maximum likelihood. One 
can hope that, from a study of the balance between truncation error (bias) and variance, it would
be possible to derive data-driven methods to estimate $\rho$ optimally with respect to a user-specified
loss function. So far, only the initial steps in this research programme have been taken; in
recent work Banaszek et al. (2000) and Paris et al. (2001) have shown that maximum likelihood
estimation of the parameters in the density (44) is numerically feasible, after the number basis
$\{|m\rangle : m = 0, 1, \ldots\}$ is truncated at (e.g.) $m = 15$ or $m = 20$. This means estimation of about
400 real parameters constrained to produce a density matrix. Numerical optimisation was used
after a reparameterisation by writing $\rho = TT^*$ as the product of an upper-triangular matrix and
its adjoint, so that only one constraint (trace 1) needs to be incorporated. We think that it is a
major open problem to work out the asymptotic theory of this method, taking account of
data-driven truncation, and possibly alleviating the problem of such a large parameter-space by using
Bayesian methods. The method should be tuned to the estimation of various functionals of $\rho$ of
interest, and should provide standard errors or confidence intervals.
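
The reparameterisation $\rho = TT^*$ can be sketched as follows. This is our own minimal version, not the implementation used in the cited papers: it maps $d^2$ unconstrained real numbers to an upper-triangular $T$ with real diagonal, and handles the trace constraint by normalising, which is an assumption on our part (a Lagrange multiplier or a constrained optimiser would be alternatives). The function name is hypothetical.

```python
import numpy as np

def rho_from_params(params, d):
    """Map d*d real parameters to a d x d density matrix rho = T T* / tr(T T*).

    T is upper triangular with a real diagonal (d parameters) and a complex
    strictly upper part (2 * d(d-1)/2 parameters), giving d*d in total.
    """
    params = np.asarray(params, dtype=float)
    T = np.zeros((d, d), dtype=complex)
    T[np.diag_indices(d)] = params[:d]
    iu = np.triu_indices(d, k=1)
    k = len(iu[0])
    T[iu] = params[d:d + k] + 1j * params[d + k:d + 2 * k]
    M = T @ T.conj().T                 # positive semi-definite by construction
    return M / np.trace(M).real        # normalise so that tr(rho) = 1

rng = np.random.default_rng(1)
d = 4
rho = rho_from_params(rng.normal(size=d * d), d)
```

With $d = 20$ this gives the 400 real parameters mentioned above; the resulting $\rho$ can be plugged into the density (44) to form a log-likelihood for numerical maximisation.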

The quantum statistical model introduced above is that of optical homodyne measurements. 
There is also an elegant mathematical model for another experimental set-up called heterodyne 
measurement. In this case the measurement is a generalised measurement or OProM, and it can 
be realised by taking the product of the Hilbert space of the system of interest with another 
infinite dimensional system, in its ground state. Write Q', P' for the position and momentum 
operators on the ancillary system. It turns out that P + P' and Q — Q' commute, and therefore 
could in principle be measured simultaneously. A joint measurement of the two is a realisation of 
a heterodyne measurement. As an OProM it is invariant under the rotation group (corresponding
to the phase changes $\phi$ of the homodyne measurement) and under a certain parametric model
for the state, called the Gaussian or coherent state of the harmonic oscillator, possesses some
decision-theoretic optimality properties because of this, see Holevo (1982). The pair now form a
quantum transformation model in the sense of Section 4.2.
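
The commutation claim behind the heterodyne construction follows in one line from (39). Writing $Q$, $P$ for $Q \otimes \mathbf{1}$, $P \otimes \mathbf{1}$ and $Q'$, $P'$ for $\mathbf{1} \otimes Q'$, $\mathbf{1} \otimes P'$ (a notational convention we assume here), operators acting on different factors commute, so that:

```latex
\begin{align*}
[\,Q - Q',\; P + P'\,]
  &= [Q, P] + [Q, P'] - [Q', P] - [Q', P'] \\
  &= i\mathbf{1} + 0 - 0 - i\mathbf{1} \\
  &= 0 .
\end{align*}
```

By contrast, $[Q + Q', P + P'] = 2i\mathbf{1}$, so it is the relative sign between the position and momentum combinations that makes the joint measurement possible.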

The field of quantum tomography is rapidly developing, with some of the latest (not yet 
published) results from the Pavia group of G.M. D'Ariano being quantum holographic methods
to estimate not an unknown state, but an unknown transformation of a state (i.e., a completely 
positive instrument with trivial outcome space). 

8 From Quantum Probability to Quantum Statistics 

A recurring theme in this section is the relation between classical and quantum probability and 
statistics. This has been a matter of heated controversy ever since the discovery of quantum 
mechanics. It has mathematical, physical, and philosophical ingredients and much confusion, 
if not controversy, has been generated by problems of interdisciplinary communication between 
mathematicians, physicists, philosophers and more recently statisticians. Authorities from both
physics and mathematics, perhaps starting with Feynman (1951), have promoted vigorously the
standpoint that 'quantum probability' is something very different from 'classical probability'. Most
recently, in two papers on Bell's inequality (which we discuss in Section 8.2) Accardi and Regoli
(2000a,b) state "the real origin of the Bell's inequality is the assumption of the applicability
of classical (Kolmogorovian) probability to quantum mechanics" which can only be interpreted
as a categorical statement that classical probability is not applicable to quantum mechanics.
Malley and Hornstein (1993) conclude from the perceived conflict between classical and quantum
probability that 'quantum statistics' should be set apart from classical statistics.

We disagree. In our opinion, though fascinating mathematical facts and physical phenomena 
lie at the root of these statements, cultural preconceptions have also played a role. Statistical 
problems from quantum mechanics fall definitely in the framework of classical statistics and the 
claimed distinctions have retarded the adoption of statistical science in physics. The phenomenon 
of quantum entanglement in fact has far-reaching technological implications, which are easy to 
grasp in terms of classical probability; their development will surely involve statistics too. 

In the first subsection we discuss, from a mathematical point of view, the distinction between 
classical and quantum probability. Next, we consider physical implications of the probabilistic
predictions of quantum mechanics through the celebrated example of the Bell (1964) inequalities
and the Aspect et al. (1982a,b) experiment. We appraise the 'classical versus quantum' question
in the light of those implications. Finally we review a number of controversial issues in the 
foundations of quantum physics (locality, realism, the measurement problem) and sketch the basics 
of quantum teleportation, emphasizing that emerging quantum technology (entanglement-assisted 
communication, quantum computation, quantum holography and tomography of instruments) 
aims to capitalise on precisely those features of quantum mechanics which in the past have often 
been seen as paradoxical theoretical nuisances. 



8.1 Classical versus Quantum Probability 

Our stance is that the predictions which quantum mechanics makes of the real world are stochas- 
tic in nature. A quantum physical model of a particular phenomenon allows one to compute 
probabilities of all possible outcomes of all possible measurements of the quantum system. The 
word 'probability' means here: relative frequency in many independent repetitions. The word 
'measurement' is meant in the broad sense of: macroscopic results of interactions of the quantum 
system under study with the outside world. These predictions depend on a summary of the state 
of the quantum system. The word 'state' might suggest some fundamental property of a partic- 
ular collection of particles, but for our purposes all we need to understand under the word is: a 
convenient mathematical encapsulation of the information needed to make any such predictions. 
Some physicists argue that it is meaningless to talk of the state of a particular particle, one can 
only talk of the state of a large collection of particles prepared in identical circumstances; this is 
called a statistical ensemble. Others take the point of view that when one talks about the state of 
a particular quantum system one is really talking about a property of the mechanism which
generated that system. Given that quantum mechanics predicts only probabilities, as far as real-world
predictions are concerned the distinction between on the one hand a property of an ensemble of 
particles or of a procedure to prepare particles, and on the other hand a property of one particular 
particle, is a matter of semantics. However, if one would like to understand quantum mechanics 
by somehow finding a more classical (intuitive) physical theory in the background which would ex- 
plain the observed phenomena, this becomes an important issue. It is also an issue for cosmology, 
when there is only one closed quantum system under study: the universe. 

It follows from our standpoint that 'quantum statistics' is, for us, classical statistical inference 
about unknown parameters in models for data arising from measurements on a quantum system. 
However, just as in biostatistics, geostatistics, etc., etc., many of these statistical problems have 
a common structure and it pays to study the core ideas and common features in detail. As we 
have seen, this leads to the introduction of mathematical objects such as quantum score, quantum 
expected information, quantum exponential family, quantum transformation model, and so on; 
the names are deliberately chosen because of analogy and connections with the existing notions
from classical statistics.

Already at the level of probability (i.e., before statistical considerations arise) one can see 
analogies between the mathematics of quantum states and observables on the one hand, and clas- 
sical probability measures and random variables on the other. This analogy is very strong and 
indeed mathematically very fruitful (also very fruitful for mathematical physics). Note that col- 
lections of both random variables and operators can be endowed with algebraic structure (sums, 
products, ...). It is a fact that from an abstract point of view a basic structure in probability
theory (a collection of random variables $X$ on a countably generated probability space, together
with their expectations $\int X\,dP$ under a given probability measure $P$) can be represented by a
(commuting) subset of the set of self-adjoint operators $Q$ on a separable Hilbert space together
with the expectations $\operatorname{tr}\{\rho Q\}$ computed using the trace rule under a given state $\rho$. Thus: a
basic structure in classical probability theory is isomorphic to a special case of a basic structure
in quantum probability. 'Quantum probability', or 'noncommutative probability theory' is the 
name of the branch of mathematics which studies the mathematical structure of states and ob- 
servables in quantum mechanics. From this mathematical point of view, one may justly claim that 
classical probability is a special case of quantum probability. The claim does entail, however, a 
rather narrow view of classical probability. Moreover, many probabilists will feel that abandoning 
commutativity is throwing away the baby with the bathwater, since this broader mathematical 
structure has no analogue of the sample outcome $\omega$, and hence no opportunity for a probabilist's
beloved probabilistic arguments. We discuss Quantum Probability further in Section 9.1 under
the heading of Quantum Stochastic Processes.

8.2 Bell, Aspect, et al. 

We now discuss some physical predictions of quantum mechanics of a most striking 'nonclassical' 
nature. Many authors have taken this as a defect of classical probability theory and there have been 
proposals to abandon classical probability in favour of alternative theories (negative, complex or p- 
adic probabilities; nonmeasurable events; noncommutative probability; . . . ) in order to 'resolve the 
paradox'. However in our opinion, the phenomena are real and the defect, if any, lies in believing 
that quantum phenomena do not contradict classical physical thinking. This opinion is supported 
by the recent development of (potential) technology which acknowledges the extraordinary nature 
of the predictions and exploits the discovered phenomena (teleportation, entanglement-assisted 
communication, and so on). In other words, one should not try to explain away the strange 
features of quantum mechanics as some kind of defect of classical probabilistic thinking, but one 
should use classical probabilistic thinking to pinpoint these features. 

Consider two spin-half particles, for which the customary state space is $\mathcal{H} = \mathbb{C}^2 \otimes \mathbb{C}^2$. Let $|0\rangle$
and $|1\rangle$ denote the orthonormal basis of $\mathbb{C}^2$ corresponding to 'spin up' and 'spin down', thus two
eigenvectors of the Pauli spin matrix $\sigma_z$. We write $|ij\rangle$ as an abbreviation for $|i\rangle \otimes |j\rangle$, defining
four elements of an orthonormal basis of our $\mathcal{H}$.

For $\vec u$ in $S^2$, let $\sigma_{\vec u} = u_x\sigma_x + u_y\sigma_y + u_z\sigma_z$, the observable 'spin in the direction $\vec u$' for one
spin-half particle. It has eigenvalues $\pm 1$ and its eigenvectors are the state-vectors $\psi(\pm\vec u)$ corresponding
to the directions $\pm\vec u$ in $S^2$. The appropriate model for measurement of spin in direction $\vec u$ on the
first particle and spin in the direction $\vec v$ on the second particle is a joint simple measurement of
the two compatible observables $\sigma_{\vec u} \otimes \mathbf{1}$ and $\mathbf{1} \otimes \sigma_{\vec v}$ (see Example 0). The possible outcomes $\pm 1$, $\pm 1$
correspond to the one-dimensional subspaces spanned by the four orthogonal vectors $\psi(\pm\vec u) \otimes \psi(\pm\vec v)$.

Now if the state of the system is a tensor product $\rho_1 \otimes \rho_2$ of separate states of each particle,
then one can directly show that the outcomes for particle 1 and particle 2 are independent, 
and distributed as separate measurements on the separate particles, as one would hope. If the 
joint state is a mixture of product states, then the outcomes will be distributed as a mixture of 
independent outcomes. For an entangled state, the outcomes can be even more heavily dependent. 

Consider the entangled pure state with state vector $\{|10\rangle - |01\rangle\}/\sqrt{2}$. This state is often called
the singlet or Bell state. Straightforward calculations, see for instance Barndorff-Nielsen et al.
(2002), show that for this state the two spin measurements have the following joint distribution:
the marginal distribution of each spin measurement is Bernoulli($\frac12$), the probability that the two
outcomes are equal (both $+1$ or both $-1$) is $\frac12(1 - \vec u \cdot \vec v)$. In particular, if the two measurements
are taken in the same direction, then the two outcomes are different with probability 1; in the
opposite direction, the two outcomes are always the same; in orthogonal directions the probability
of equality is $\frac12$ so, taking account of the marginal distributions, the two outcomes are independent.

The singlet state is an appropriate description for the spins of two spin-half particles produced 
simultaneously in some nuclear scattering or decay processes where a total spin of 0 is conserved.
The two particles have exactly opposite spin, which seems reasonable. The two particles are
together in a pure state, which is also reasonable if the process involved was a Schrödinger evolution
starting from a pure state. The model also exhibits a rotational invariance. These are all good 
reasons to expect the model to be not just a hypothetical possibility but a real possibility (and 
indeed, it is). 

Fix a special choice of two possible different values of $\vec u$ and two possible different values of $\vec v$.
Let us suppose that all four directions are in the same great circle on $S^2$ and let $\vec u_1$ and $\vec u_2$ be in
the directions 0° and 120°, let $\vec v_1$ and $\vec v_2$ be in the directions 180° and 60°. Since $\cos(60°) = \frac12$ we
see that: when the directions are the pair (0°, 180°) then the probability the two spins are found
to be equal is 1; but when the directions are any of the three pairs (0°, 60°) or (120°, 180°) or
(120°, 60°) the two spins are found to be equal with probability $\frac14$. Is this surprising?
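
These probabilities can be reproduced directly from the singlet state vector. The sketch below builds the projectors of $\sigma_{\vec u} \otimes \sigma_{\vec v}$ onto equal outcomes and evaluates them in the singlet state; placing the great circle in the $x$-$z$ plane and the helper names are our own choices.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
singlet = (np.kron(ket1, ket0) - np.kron(ket0, ket1)) / np.sqrt(2)   # (|10> - |01>)/sqrt(2)

def projector(u, sign):
    """Projector onto the eigenvalue `sign` (+1 or -1) of spin in direction u."""
    spin = u[0] * SX + u[1] * SY + u[2] * SZ
    return (I2 + sign * spin) / 2

def prob_equal(u, v):
    """Probability that the two spin outcomes coincide in the singlet state."""
    total = 0.0
    for s in (+1, -1):
        joint = np.kron(projector(u, s), projector(v, s))
        total += np.vdot(singlet, joint @ singlet).real
    return total

def direction(degrees):
    """Unit vector at the given angle on a great circle (x-z plane, by assumption)."""
    a = np.deg2rad(degrees)
    return np.array([np.sin(a), 0.0, np.cos(a)])

pairs = [(0, 180), (0, 60), (120, 180), (120, 60)]
probs = [prob_equal(direction(a), direction(b)) for a, b in pairs]
```

Up to rounding, `probs` contains the values 1, 1/4, 1/4, 1/4 quoted in the text.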

Consider an experiment where pairs of particles are generated in the singlet state, and then 
made to travel to two far- apart locations, at each of which spin is measured in one of the two 
directions just specified. Suppose the experiment is repeated many times, with random and 
independent choice of the two directions for measurement at each of the two locations. We have just 
computed the probabilities of all possible outcomes under each of the four possible combinations 
of directions. 

Let us try to simulate the predicted statistics of the experiment using classical objects. To 
be very concrete, consider two people who try to simulate two spin-half particles. They start in 
a room together but then leave by different doors. Outside the room they are separately told a
direction, $\vec u_1$ or $\vec u_2$ for person 1, $\vec v_1$ or $\vec v_2$ for person 2, and asked to choose an outcome '$+1$' or
'$-1$'. They are not allowed to communicate any more once they have left the room. Moreover
the directions will be chosen independently and randomly. The whole procedure will be repeated 
many many times and their aim is to simulate the quantum probabilities stated above. The two 
persons obviously will need randomisation in order to imitate the randomness of spin-half particles. 
We allow them to toss dice or coins, in any way they like, and to do this together in the room 
before leaving. They can simulate in this way any degree of dependence or independence they 
like. Let us call the outcome of their randomisation process lo. Their strategy will then be two 
pairs of functions of oj, with values ±1, which determine the answers each person would give when 
confronted with each of his two directions on leaving the room, when the randomisation produces 
the outcome uj. 

This whole set-up defines four Bernoulli $\pm 1$-valued random variables, let us call them $X_1$,
$X_2$, $Y_1$, $Y_2$; the $X$ variables for person 1 and the $Y$ variables for person 2. The four must be
such that any pair $X_i$, $Y_j$ has the same joint distribution as the result of measuring spins in the
directions $u_i$ and $v_j$. Now it is easy to check that since these four variables are binary, $X_1 \ne Y_2$
and $Y_2 \ne X_2$ and $X_2 \ne Y_1$ implies $X_1 \ne Y_1$ (just fill in $+1$, $-1$, $+1$, $-1$ for $X_1$, $Y_2$, $X_2$, $Y_1$ in order;
or alternatively $-1$, $+1$, $-1$, $+1$). Taking the contrapositive, $X_1 = Y_1$ implies $X_1 = Y_2$ or $Y_2 = X_2$ or
$X_2 = Y_1$. Therefore we have

\[ P(X_1 = Y_1) \le P(X_1 = Y_2) + P(Y_2 = X_2) + P(X_2 = Y_1). \]

But the four probabilities we are trying to simulate are $1$, $1/4$, $1/4$, $1/4$ and it is not true that
$1 \le 1/4 + 1/4 + 1/4$. Therefore it is not possible to simulate with classical means (people or computers or
other classical physical systems) the predicted outcomes of measurements of two spin-half particles!
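The combinatorial step can be verified by brute force. The following sketch enumerates all 16 deterministic assignments of $\pm 1$ to $(X_1, X_2, Y_1, Y_2)$ and checks the pointwise version of the inequality; any randomised strategy is a mixture of these, so the inequality for probabilities follows by averaging. The function and variable names are ours.

```python
from itertools import product

def indicator_inequality_holds(x1, x2, y1, y2):
    """Check 1[x1 = y1] <= 1[x1 = y2] + 1[y2 = x2] + 1[x2 = y1]."""
    lhs = int(x1 == y1)
    rhs = int(x1 == y2) + int(y2 == x2) + int(x2 == y1)
    return lhs <= rhs

# Every deterministic strategy fixes a value in {+1, -1} for X1, X2, Y1, Y2.
all_strategies = list(product((+1, -1), repeat=4))
classical_ok = all(indicator_inequality_holds(*s) for s in all_strategies)

# Quantum predictions for the chosen directions: P(X1 = Y1) = 1, the others 1/4.
quantum_lhs = 1.0
quantum_rhs = 0.25 + 0.25 + 0.25
quantum_ok = quantum_lhs <= quantum_rhs

print("holds for all 16 deterministic strategies:", classical_ok)
print("holds for the quantum probabilities:", quantum_ok)
```

The first check succeeds and the second fails, which is exactly the contradiction described in the text.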

The inequality we have just derived is due to Bell (1964), who contrasted it with the
prediction of quantum mechanics in order to prove the failure, a priori, of any attempt through
the introduction of hidden variables to explain the randomness of outcomes of measurements of
quantum systems through 'mere statistical variation' in not directly observed and uncontrollable
(hence hidden) properties of the quantum systems or measurement devices. He assumed that any
physically meaningful hidden variables model would satisfy the physically reasonable property
of locality, that is to say, the outcome of a measurement on one particle in one location should
not depend on the measurement being carried out simultaneously on the other particle in another
distant location. Inspection of the argument we have given shows that Bell's inequality is not
due to our slavish adherence to classical probability, but arises simply from the assumption that the
outcome of a measurement on one particle should not depend on which measurement is being made
on the other particle. This is reason enough for some authors, for instance Maudlin (1994), to
conclude that Bell's argument shows that the predictions of quantum mechanics violate locality;
he goes on to study the possible conflicts with relativity theory and concludes that there is no
conflict in the sense that this phenomenon does not violate the requirements that cause and effect
should not spread faster than the speed of light, and there is not a conflict with the basic
relativistic (Minkowski) invariance property. Thus quantum mechanics lives in uneasy but peaceful
coexistence with relativity theory.

All this would be purely academic were it not the case that the model we have just described
truly is appropriate in certain physical situations and the predictions of quantum theory have
been experimentally verified; first by Alain Aspect and his coworkers in a celebrated experiment
(reported in Aspect et al., 1982a,b) in Orsay, Paris, where polarisation of pairs of entangled photons
emitted from an excited calcium atom was measured with polarisation filters several metres apart;
the orientation of the filters being fixed independently and randomly after the photons had been
emitted from the source and before they arrived at the polarisation filter. (Polarisation of photons
has a very similar mathematical description to spin of spin-half particles, except that all angles
need to be halved; entangled photons have equal behaviour at polarisation filters oriented 90° to
one another.) More recently, the experiment has been done on the glass fibre network of Swiss
telecom with the two filters being 10 km apart on different shores of Lake Geneva.

Our conclusion is that quantum mechanics makes extraordinary physical predictions,
predictions which are properly stated and interpreted in the language of classical probability.
Technological implications of these predictions are only just beginning to be explored. One proposal is
entanglement-assisted communication, see Bennett et al. (1999b, 2001); Holevo (2001b). Suppose A
would like to send a message to B by encoding the message in the states of a sequence of spin-half
particles transmitted one by one from A to B. At the receiving end B carries out measurements
on the received particles on the basis of which he infers the message. Obviously the results will
be random, especially if the communication channel suffers from noise, of classical or quantum
nature. Using the theory of instruments one can describe mathematically all physically possible
communication channels and all physically possible decoding (measurement) schemes, and
compute analogously to classical information theory the maximum rate of transmission of information
through the channel. Suppose now A and B allow themselves a further resource for
communication. In between A and B a third person C is located, and he sends A and B simultaneously pairs
of entangled spin-half particles, in step with the transmission of particles from A to B. 'Obviously'
there is no way these particles can be used to transmit information from A to B. They come from
a different source altogether and are created in a fixed and known state. Yet it turns out that if
A uses one part of the entangled pair in his encoding step with each particle he transmits, and B
uses the other part of the pair in his decoding step, the rate of transmission can be doubled.
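The best-known noiseless instance of this doubling is the protocol usually called superdense coding, in which A applies one of four unitaries to her half of the singlet pair and then sends that particle itself. The sketch below is an illustration under that assumption (ideal channel, our own choice of the four encoding matrices and of the names `encode`, `ENCODINGS`, `gram`); it checks that the four resulting two-particle states are orthonormal, so that B can in principle distinguish four messages, i.e. two bits, per transmitted particle.

```python
import math

S = 1 / math.sqrt(2)

# Singlet state of (A's particle, B's particle); amplitude index = 2*a + b for |ab>.
SINGLET = [0.0, -S, S, 0.0]          # (|10> - |01>) / sqrt(2)

# Unitaries that A may apply to her particle alone (matrix entries u[new][old]).
ENCODINGS = {
    "00": [[1, 0], [0, 1]],          # identity
    "01": [[0, 1], [1, 0]],          # bit flip
    "10": [[1, 0], [0, -1]],         # phase flip
    "11": [[0, -1], [1, 0]],         # bit flip combined with phase flip
}

def encode(message):
    """Apply A's unitary for this two-bit message to the first particle."""
    u = ENCODINGS[message]
    out = [0.0] * 4
    for a in range(2):
        for b in range(2):
            for a_new in range(2):
                out[2 * a_new + b] += u[a_new][a] * SINGLET[2 * a + b]
    return out

def overlap(x, y):
    """Inner product; all amplitudes in this example are real."""
    return sum(xi * yi for xi, yi in zip(x, y))

states = {m: encode(m) for m in ENCODINGS}
gram = {(m, n): overlap(states[m], states[n]) for m in states for n in states}

for m in states:
    print(m, [round(gram[(m, n)], 12) for n in states])
```

Because the Gram matrix is the identity, a single joint measurement in the basis formed by these four states recovers the two-bit message with certainty.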

These extraordinary results show that it would be foolish to 'explain away' the phenomenon 
discovered by Bell by turning to some exotic probability theory (though many authors have done 
precisely this!). On the contrary, the mathematics — using classical probability — shows that strange 
things are going on and indeed it seems likely that one will be able to harness them in future 
technology. 

8.3 Teleportation 

As an example we show how the singlet state of a pair of spin-half particles, supposed to be
in two distant locations, can be used to transmit a third spin-half state from one location to
the other. This scheme was invented by Bennett et al. (1993) and experimentally carried out by
A. Zeilinger's group in Innsbruck, see Bouwmeester et al. (1997). For a recent survey including
references to the results of other experimental groups see Bouwmeester et al. (2001). The method
illustrates how quantum technology (e.g., computation) will combine the basic ingredients of simple
measurements, unitary evolution, and entanglement (product systems). The state being teleported
is supposed to be completely unknown. This means that any attempt to measure it, and then
teleport it by communicating in a classical way the results of measurement, cannot succeed, since
the outcomes will be random, do not determine the initial state, and the initial state will have
been destroyed by the measurement. The no-cloning theorem of Wootters and Zurek (1982) and Dieks
(1982) shows that there is no instrument which can transform a state $\rho$ together with an ancillary
quantum system into two identical copies $\rho \otimes \rho$.

Consider a single spin-half particle in the pure state with state-vector $\alpha|1\rangle + \beta|0\rangle$. It is brought
into interaction with a pair of particles in the singlet state so that the whole system is in the
pure state with state-vector, after multiplication of the tensor product, and up to a factor $1/\sqrt{2}$,
$\alpha|110\rangle - \alpha|101\rangle + \beta|010\rangle - \beta|001\rangle$. The three particles are here written in the sequence: particle
to be teleported, first entangled particle at the source location, second entangled particle at the
destination location. Now we introduce the following four orthogonal state-vectors for the two
particles at the source location, neglecting another constant factor $1/\sqrt{2}$: $\Phi_1 = |10\rangle - |01\rangle$,
$\Phi_2 = |10\rangle + |01\rangle$, $\Psi_1 = |11\rangle + |00\rangle$, $\Psi_2 = |11\rangle - |00\rangle$, and we note that our three particles together
are in a pure state with state-vector which may be written (up to yet another factor, $1/\sqrt{4}$)

\[ \Psi_1 \otimes (\alpha|0\rangle - \beta|1\rangle) + \Psi_2 \otimes (\alpha|0\rangle + \beta|1\rangle)
   + \Phi_1 \otimes (-\alpha|1\rangle - \beta|0\rangle) + \Phi_2 \otimes (-\alpha|1\rangle + \beta|0\rangle). \]

So far nothing has happened at all: we have simply rewritten the state-vector of the three particles as a
superposition of four state-vectors, each lying in one of four orthogonal two-dimensional subspaces
of $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$: namely the subspaces $\Phi_1 \otimes \mathbb{C}^2$, $\Phi_2 \otimes \mathbb{C}^2$, $\Psi_1 \otimes \mathbb{C}^2$ and $\Psi_2 \otimes \mathbb{C}^2$.

To these four subspaces corresponds a simple instrument. It only involves the two particles
at the source location and hence may be carried out by the person at that location. He obtains
one of four different outcomes, each with probability $1/4$, so he learns nothing about the particle
to be teleported. However, conditional on the outcome of his measurement, the particle at the
destination is in one of the four pure states with state-vectors $\alpha|0\rangle - \beta|1\rangle$, $\alpha|0\rangle + \beta|1\rangle$,
$-\alpha|1\rangle - \beta|0\rangle$, $-\alpha|1\rangle + \beta|0\rangle$. The mixture with equal probabilities of these four states is the completely mixed
state $\rho = \tfrac12 \mathbf{1}$, so nothing has happened at the destination: the second part of the
entangled pair still is in its original (marginal) state. But once the outcome of the measurement
at the source is transmitted to the destination (two bits of information, transmitted by classical
means), the receiver is able by means of one of four unitary transformations to transform the
resulting pure state into the state with state-vector $\alpha|1\rangle + \beta|0\rangle$: teleportation is successful. Neither
source nor destination learns anything at all about the particle being transmitted by this procedure.
If the state being teleported was a mixture, then decomposing it into pure components which are
teleported independently and perfectly shows that the final destination state is the same mixture.
In short, by transmitting two classical bits of information we are able to copy a point in the unit
ball (specified by three real numbers) from A to B, without learning anything about the point at
all in the process.
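The scheme can be traced numerically for a concrete input state. The sketch below is a minimal illustration using only the Python standard library: it builds the three-particle state, projects the two source particles onto each of the four (normalised) vectors $\Phi_1, \Phi_2, \Psi_1, \Psi_2$, and applies a correcting unitary at the destination. The text only asserts that four suitable unitaries exist; the particular matrices in `corrections`, like the function name `teleport`, are our own worked-out choices and should be read as assumptions.

```python
import math

def teleport(alpha, beta):
    """Return {outcome: (probability, fidelity with alpha|1> + beta|0>)}."""
    s = 1 / math.sqrt(2)
    # Joint state; amplitude index = 4*a + 2*b + c for basis vector |abc>
    # (a: particle to teleport, b: source half, c: destination half).
    psi = [0j] * 8
    for a, amp in ((1, alpha), (0, beta)):
        psi[4 * a + 2] += amp * s      # |a> tensor |10>
        psi[4 * a + 1] -= amp * s      # |a> tensor |01>, with the singlet's minus sign
    # Normalised measurement vectors on the two source particles, index 2*a + b.
    outcomes = {
        "Phi1": [0, -s, s, 0],         # |10> - |01>
        "Phi2": [0, s, s, 0],          # |10> + |01>
        "Psi1": [s, 0, 0, s],          # |11> + |00>
        "Psi2": [-s, 0, 0, s],         # |11> - |00>
    }
    # Hypothetical correcting unitaries acting on (amplitude of |0>, amplitude of |1>).
    corrections = {
        "Phi1": [[1, 0], [0, 1]],
        "Phi2": [[1, 0], [0, -1]],
        "Psi1": [[0, -1], [1, 0]],
        "Psi2": [[0, 1], [1, 0]],
    }
    target = [complex(beta), complex(alpha)]   # alpha|1> + beta|0>
    report = {}
    for name, vec in outcomes.items():
        # Contract the (real) measurement vector with the first two particles.
        dest = [sum(vec[ab] * psi[2 * ab + c] for ab in range(4)) for c in range(2)]
        prob = sum(abs(z) ** 2 for z in dest)
        dest = [z / math.sqrt(prob) for z in dest]
        u = corrections[name]
        fixed = [u[r][0] * dest[0] + u[r][1] * dest[1] for r in range(2)]
        fidelity = abs(sum(t.conjugate() * f for t, f in zip(target, fixed))) ** 2
        report[name] = (prob, fidelity)
    return report

report = teleport(0.6, 0.8j)
for name, (prob, fidelity) in report.items():
    print(f"{name}: probability {prob:.3f}, fidelity after correction {fidelity:.3f}")
```

Each outcome has probability $1/4$ whatever $\alpha$ and $\beta$ are, and the corrected state agrees with the input up to a global phase, which is why the fidelity rather than the raw vector is compared.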

8.4 The Measurement Problem 

We summarise here the problem raised by Schrödinger's cat, and survey briefly some responses.
Consider a spin-half particle in the pure state with state-vector $\alpha|0\rangle + \beta|1\rangle$, where $|\alpha|^2 + |\beta|^2 = 1$.
Suppose a measurement is made of the PProM with elements $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$, resulting in the
outcomes 0 and 1 with probabilities $|\alpha|^2$, $|\beta|^2$. Next to the measurement device is a cage containing
a cat and a closed bottle of poison. If the outcome is 1, an apparatus automatically releases the
poison and the cat dies. Otherwise, it lives. We suppose this whole system is enclosed in a large
container and isolated from the rest of the universe.

Now the contents of that container are themselves just one large quantum system, and
presumably it evolves unitarily under some Hamiltonian. If $\alpha = 0$, the final situation involves a
dead cat. Let us denote its state-vector then by $|\mathrm{dead}\rangle$. If $\beta = 0$ then the final state of the
cat has state-vector $|\mathrm{alive}\rangle$. So by linearity, in general the final state of the cat has state-vector
$\alpha|\mathrm{alive}\rangle + \beta|\mathrm{dead}\rangle$. How would the cat experience being in this state?

When the container is opened and we look in, presumably a measurement does take place of
the state of the cat, and at that moment (and only at that moment) it collapses into one of the
two states with state-vectors $|\mathrm{alive}\rangle$, $|\mathrm{dead}\rangle$ with the probabilities $|\alpha|^2$, $|\beta|^2$. Recently, a number
of experiments have been done which are purported to produce Schrödinger cats, in the sense
of quantum superpositions of macroscopically distinct physical states of physical systems. For
instance, Mooij et al. (1999) report on an experiment in which an electronic current involving of the
order of a billion ($10^9$) electrons flows in a superposition of clockwise and anticlockwise directions
around a supercooled aluminium ring of a few micrometres in diameter (a thousand times larger
than a typical molecular dimension). See Gill (2001b) for a discussion of this experiment and of
the role of quantum statistics in confirming its success.

The situation is made more complicated when another person, known in the literature as 
Wigner's friend, is included in the system. He is in a room together with the container and at 
some point looks in the container. Only later does he report his findings to us. 

This weird story accentuates some strange features of quantum mechanics. We told it as if 'the 
state' of a quantum system is something with physical reality, as it were, 'engraved' in the particles 
constituting the system. This idea leads us to suppose states exist which are very hard to imagine, 
and never observed in the real world. We see that the 'collapse of the state-vector', supposed to 
occur when a measurement takes place, seems to contradict the fact that measurement devices 
are physical systems themselves, and the device and the system being measured should evolve 
unitarily, not suddenly jump randomly from one state to another. We see that the dividing line 
between quantum system and the outside world is completely arbitrary, yet plays a central role in 
the theory (separating deterministic unitary evolution from random state-collapse). 

Many different standpoints can be taken on these issues. The most extreme are those of the 
empiricist (or instrumentalist, or pragmatician) on the one hand, and the realist (who is actually 
an idealist) on the other. The empiricist does not believe in some kind of physical reality behind 
observed facts. He is interested only in making correct predictions about observable features of 
the world. For this person the only problem in our story is that the dividing line between quantum 
system and classical environment is somewhat arbitrary. If different descriptions lead to different 
prescriptions, there is a problem with the mathematical model. Below we present a simplified 
version of a consistency argument, which aims to show that there is no conflict between the two 
ingredients of quantum theory, and no inconsistency when the Heisenberg divide between quantum 
system and outside world may be placed at several different places. 

Considerations very similar to those used in the consistency argument are also often used
to argue that the von Neumann (random) collapse of the wave function can be derived from
(deterministic) Schrödinger evolution. However we are inclined to believe that such claims are
incomplete. If one believes that the state of things in the world is described by wave-functions, one
still has a problem in relating wave-functions to physical properties of real objects. This problem
is supposedly addressed by Everett's many worlds theory, van Fraassen's modal interpretation,
and Griffiths' and Omnès' theory of consistent histories, among others. We find none of these
attempts to make von Neumann redundant very convincing. However, the realist who wants the
wave-function to be actually there in reality, and who believes that the true dynamics of physical
systems is according to Schrödinger's equation alone, is forced in this direction. For cosmologists,
wanting to model the whole universe without external observer, there seems to be a problem, since
quantum randomness is a key part of modern theories of the origin of the universe.

The alternative for the realist is to extend or alter Schrödinger's dynamics in order to introduce
a random element, which should make no difference to small quantum systems but should 'simulate'
the von Neumann collapse on big ones. Two fairly well explored variants of this idea are Bohm's
hidden variables model, and the 'continuous spontaneous localisation' model of Ghirardi, Rimini
and Weber. Most physicists are unhappy about these theories, since their claim to legitimacy is
essentially that they reproduce unitary evolution and wave-function collapse in the two extreme
situations where these should hold; 'in between' the physics is too difficult to make predictions, let
alone test them by experiment. Thus the models do not seem to have new, testable consequences,
while they include variables which determine the outcome of measurement, hence must be
non-local.

Now we turn to the consistency argument, which aims to show that there is no contradiction 
between Schrödinger evolution and von Neumann collapse, in the sense that placing the dividing
line between quantum system and outside world at different levels does not lead to different 
conclusions (at least, for an observer who is always in the outside world). This particular version 
was communicated to us by Franz Merkl. 

Consider a spin-half particle which passes through the magnetic field of a Stern-Gerlach
apparatus and then, if its spin is 'up', hits a photographic plate where a chain reaction produces a
visible spot. If the spin is 'down' suppose the particle is lost. (This is a bit simpler than allowing
the spin-down particle to hit the photographic plate at a different position: we have to model
the interaction only in the spin-up case.) We will call the photographic plate the detector. If the
particle starts in the state $\alpha|0\rangle + \beta|1\rangle$, where $|0\rangle$ and $|1\rangle$ represent spin-up and spin-down
respectively, and the coefficients $\alpha$ and $\beta$ satisfy $|\alpha|^2 + |\beta|^2 = 1$, we get to see the spot with
probability $|\alpha|^2$. Now the consistency problem arises because we could just as well have
considered particle plus photographic plate as one large quantum system evolving jointly under some
Hamiltonian for some length of time. If the detector started off in some pure state, then the final
joint state of the joint system is another pure state, and no random jump to one of two possible
final states has taken place. Let us however admit that the large system of the photographic
plate involves many, many particles, and repetition of the experiment with the whole system in an
identical pure state is physically meaningless to consider. At each repetition there are myriads of
tiny differences. Therefore physically relevant predictions are only obtained when we use a mixed
state as input for the macroscopic system. To make the mathematics even more simple, we will
suppose that what varies from instance to instance is the length of time of the interaction. Let
$|\psi\rangle$ be the state-vector of the detector, before the interaction starts. The joint system starts in the
pure state with state-vector $(\alpha|0\rangle + \beta|1\rangle) \otimes |\psi\rangle$. Now the Hamiltonian of the interaction between
particle and detector must be of the form $|0\rangle\langle 0| \otimes H$ where $H$ acts on the huge Hilbert space of
the detector, since there is a change to the detector if the particle starts in the spin-up state, but
not at all if the particle starts in the spin-down state. Let the length of time of the interaction be
$\tau$. Then the final state of the joint system after the interaction is the pure state with state-vector
$\alpha|0\rangle \otimes e^{-iH\tau/\hbar}|\psi\rangle + \beta|1\rangle \otimes |\psi\rangle$. The corresponding density-matrix can be written out, partitioned
according to the first component of the joint system, as

\[ \begin{pmatrix}
|\alpha|^2\, e^{-iH\tau/\hbar}|\psi\rangle\langle\psi| e^{iH\tau/\hbar} & \alpha\bar\beta\, e^{-iH\tau/\hbar}|\psi\rangle\langle\psi| \\
\bar\alpha\beta\, |\psi\rangle\langle\psi| e^{iH\tau/\hbar} & |\beta|^2\, |\psi\rangle\langle\psi|
\end{pmatrix}. \]

Now suppose we replace $H\tau$ by $H\tau + I\epsilon$ where $I$ is the identity matrix. The idea here is that $H\tau$
must in some sense be large, since it produces a macroscopic change in a large quantum system.
Thus this is a tiny perturbation of the interaction if $\epsilon$ is small, but on the other hand, since $\hbar$ is
so tiny, $\epsilon/\hbar$ can still be very large. As we vary $\epsilon$ smoothly over some small interval, $\epsilon/\hbar$ varies
smoothly over a huge range of values, and therefore the fractional part of $\epsilon/(2\pi\hbar)$ is close to
uniformly distributed over the interval $[0, 1]$. Consequently, the factor $e^{i\epsilon/\hbar}$ is close to uniformly
distributed over the unit circle. Now after we have made this perturbation to the interaction, the
density matrix of the joint state is

\[ \begin{pmatrix}
|\alpha|^2\, e^{-iH\tau/\hbar}|\psi\rangle\langle\psi| e^{iH\tau/\hbar} & \alpha\bar\beta\, e^{-i\epsilon/\hbar} e^{-iH\tau/\hbar}|\psi\rangle\langle\psi| \\
\bar\alpha\beta\, e^{i\epsilon/\hbar} |\psi\rangle\langle\psi| e^{iH\tau/\hbar} & |\beta|^2\, |\psi\rangle\langle\psi|
\end{pmatrix}. \]

On averaging over $\epsilon$, the off-diagonal factors disappear and we find the density matrix

\[ \begin{pmatrix}
|\alpha|^2\, e^{-iH\tau/\hbar}|\psi\rangle\langle\psi| e^{iH\tau/\hbar} & 0 \\
0 & |\beta|^2\, |\psi\rangle\langle\psi|
\end{pmatrix}. \]

This is the density matrix of the joint system which with probability $|\alpha|^2$ is in the pure state
with state-vector $|0\rangle \otimes e^{-iH\tau/\hbar}|\psi\rangle$ and with probability $|\beta|^2$ is in the pure state with state-vector
$|1\rangle \otimes |\psi\rangle$. In other words, either a spin-up particle and a detector which indicates a particle was
detected, or a spin-down particle and a detector which indicates no particle was detected.
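To get a feeling for the orders of magnitude in the averaging step, the sketch below evaluates the mean of the phase factor $e^{i\epsilon/\hbar}$ when $\epsilon$ is uniform on an interval $[0, w]$, using the closed form $(e^{ix} - 1)/(ix)$ with $x = w/\hbar$. The widths tried, the sample values of $\alpha$ and $\beta$, and the function name `mean_phase` are illustrative assumptions; only the scalar factor multiplying the off-diagonal blocks is computed, not the operators themselves.

```python
import cmath

HBAR = 1.054571817e-34   # reduced Planck constant, in joule seconds

def mean_phase(width):
    """Mean of exp(i * eps / hbar) for eps uniform on [0, width] (joule seconds),
    computed from the exact integral (exp(i x) - 1) / (i x), x = width / hbar."""
    x = width / HBAR
    return (cmath.exp(1j * x) - 1) / (1j * x)

alpha, beta = 0.6, 0.8   # sample amplitudes with |alpha|^2 + |beta|^2 = 1

for width in (1e-35, 1e-33, 1e-30):
    damping = abs(mean_phase(width))
    print(f"width {width:.0e} J s: |off-diagonal factor| = {abs(alpha * beta) * damping:.2e}")
```

The modulus of the mean phase is bounded by $2\hbar/w$, so even a spread in $\epsilon$ that is minute on any macroscopic scale suppresses the off-diagonal blocks almost completely.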

This argument is simple and one can criticise it in many ways. One would prefer to put the
initial randomness into the many particles making up the detector, rather than into the interaction,
and it should not have such a special form. But this is not a problem. Much more realistic models
can be worked through which lead to the same qualitative conclusion: allowing variability in the
initial conditions of the macroscopic measuring device, of a most innocuous kind, allows random
phase factors such as $e^{i\epsilon/\hbar}$ to wipe out off-diagonal terms in a large density matrix, so that all
future predictions of the joint system are the same as if a random jump had occurred during the
initial interaction to one of two macroscopically distinct states.

In conclusion, it seems that as long as one is interested in using quantum mechanics only 
to predict what happens in a small part of the universe, and takes the randomness of quantum 
mechanics as intrinsic, not something which should be explained in a deterministic way, there are 
no logical inconsistencies in the theory. The state vector or state matrix of a quantum system 
should not be thought of as having an objective reality, somehow 'engraved' in the physical nature 
of a single instance of some quantum system, but is rather a characteristic of the preparation of the 
quantum system which, at least conceptually if not actually, could be repeated many times. Thus 
a statistical description goes in, and a statistical description comes out. The working quantum 
physicist even makes do without the von Neumann collapse of a quantum system, on measurement, 
since realistic quantum mechanical modelling of the quantum system under study together with 
the macroscopic measurement device allows one to introduce statistical variation in the initial state 
of the measurement device of the kind we have just described, and this leads irrevocably, it seems, 
to density matrices which are diagonal in the bases expressing macroscopically distinguishable 
states. In other words, unitary evolution alone, starting from the mixed initial state of quantum 
system plus measuring environment, is enough to determine the correct probability distribution 
over macroscopically distinguishable, thus 'real world', outcomes. The working quantum physicist 
is also well aware that the Hamiltonians he uses are only 'effective Hamiltonians' relative to some 
energy cut-off, which in turn corresponds to some approximation of a much larger state space by a 
smaller one. So the concerns of workers in the foundations of physics, worried about whether 'the 
state vector of the universe' evolves in a unitary, deterministic way, or a random, non-unitary way, 
could turn out in the long run to be as purely academic as those of medieval theologians trying 
to calculate how many angels could dance on the head of a pin, since sooner or later physicists 
will learn that quantum mechanics was itself only a limiting case of a better theory, as happened 
to Newtonian mechanics before. If we think about it carefully, we realise that the reality of basic 
concepts of classical physics is as illusory as that of basic concepts of modern physics. 



9 Some Further Topics 

9.1 Quantum stochastic processes 

Since its inception in the early 1980's, through pioneering work of Hudson and Parthasarathy,
quantum (or noncommutative) probability has grown into a mature and sophisticated
mathematical field. The criticism which we levelled at the philosophical standpoint of its protagonists
in Section 8.1 does nothing to reduce the mathematical and physical results which have been
achieved; see, for instance, Accardi et al. (1997). An excellent introduction to the field has been
given by Biane (1995) and a more comprehensive account is available from the hand of Meyer
(1993), see also Parthasarathy (1992). A new journal Infinite Dimensional Analysis, Quantum
Probability and Related Topics, now in its fourth year, is home to many of the more recent
developments. Here we shall summarise briefly some aspects of quantum stochastic processes, under
several subheadings.



Quantum optics. Quantum optics is one of the currently most active and exciting fields of
quantum physics, particularly from the viewpoint of the present paper. Laser cooling, on which
we comment separately below, is, or may be viewed as, one of the areas in this field. Here we
discuss briefly the Markov quantum (optical) master equation (MQME) and its quantum stochastic
differential equation (QSDE) counterparts.

The Markovian quantum master equation provides an (approximate) description of a wide
range of quantum system evolutions. The MQME is of the form

\[ \dot\rho(t) = L(t)\rho(t), \]

where $L(t)$ is a linear operator. In order for this equation to have a solution such that $\rho(t)$ is a
density operator for each $t$, $L(t)$ must be of the Lindblad form (Lindblad, 1976)

\[ L\rho = -\frac{i}{\hbar}[H, \rho] + \sum_k \Bigl( A_k \rho A_k^* - \tfrac12 \rho A_k^* A_k - \tfrac12 A_k^* A_k \rho \Bigr), \tag{47} \]

where $H$ is some Hermitian operator and the $A_k$ are (bounded) operators. To each such operator
there exists a variety of QSDE's for a process $\psi(t)$ with values in $\mathcal{H}$ such that, writing
$\hat\rho(t) = |\psi(t)\rangle\langle\psi(t)| / \langle\psi(t)|\psi(t)\rangle$, we have $E[\hat\rho(t)] = \rho(t)$. See, for instance, Mølmer and Castin (1996),
Wiseman, and Gardiner and Zoller (2000, Chap. 5).

Interestingly, the same Markov quantum master equation has turned up in the Ghirardi-Rimini-Weber
'continuous spontaneous localisation' approach to the measurement problem, whereby
unitary Schrödinger evolution is replaced by a stochastic differential equation, which is able to mimic,
according to the circumstances, both purely unitary evolution of a closed quantum system, and
the von Neumann collapse of the wave function of a quantum system interacting with a large
(measuring) environment.

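To illustrate how equation (47) can be numerically calculated by simulating many times a
QSDE in what is called the quantum Monte Carlo approach, we consider the simplest case, when
the index $k$ just takes a single value and can therefore be omitted. Moreover, absorb the constant
$\hbar$ into the Hamiltonian $H$. We show that the evolution is identical to the mean evolution of
the following stochastic process for an unnormalised state vector $\psi$: the deterministic but
non-Hamiltonian evolution

\[ \dot\psi = -iH\psi - \tfrac12 A^*A\psi \]

interrupted by collapses

\[ \psi \to A\psi \]

with stochastic intensity

\[ I = \|A\psi\|^2 / \|\psi\|^2. \]

Introducing a counting process $N$ with intensity $I$ one can combine these equations into one QSDE
of jump type,

\[ d\psi = \bigl(-iH\psi - \tfrac12 A^*A\psi\bigr)\,dt + (A\psi - \psi)\,dN. \]

Define $\tilde\rho = \psi\psi^*$, the unnormalised random density matrix corresponding to the stochastic
evolution, and $\hat\rho = \tilde\rho/\operatorname{tr}\tilde\rho$. Note that $I = \operatorname{tr}(A\tilde\rho A^*)/\operatorname{tr}\tilde\rho = \operatorname{tr}(A\hat\rho A^*)$. Since $d\tilde\rho = d\psi\cdot\psi^* + \psi\cdot d\psi^*$
and $\dot\psi^* = i\psi^*H - \tfrac12\psi^*A^*A$, the smooth part of the evolution can be rewritten as

\[ \frac{d\tilde\rho}{dt} = -i[H, \tilde\rho] - \tfrac12\bigl(A^*A\tilde\rho + \tilde\rho A^*A\bigr). \]

Taking the trace, we find on the smooth part $d(\operatorname{tr}\tilde\rho)/dt = -\operatorname{tr}(A\tilde\rho A^*)$. Together, this yields

\begin{align*}
\frac{d\hat\rho}{dt} &= \frac{1}{\operatorname{tr}\tilde\rho}\frac{d\tilde\rho}{dt} - \frac{\tilde\rho}{(\operatorname{tr}\tilde\rho)^2}\frac{d(\operatorname{tr}\tilde\rho)}{dt} \\
 &= -i[H, \hat\rho] - \tfrac12\bigl(A^*A\hat\rho + \hat\rho A^*A\bigr) + I\hat\rho.
\end{align*}

For the jump part, define $N(t)$ to be the number of jumps in the time interval $(0, t]$. Then at a
jump time we can write, with $\hat\rho_-$ denoting the state just before the jump,

\begin{align*}
d\hat\rho &= \Bigl(\frac{A\hat\rho_- A^*}{I} - \hat\rho_-\Bigr)\,dN \\
 &= \Bigl(\frac{A\hat\rho_- A^*}{I} - \hat\rho_-\Bigr)(dN - I\,dt) + \bigl(A\hat\rho_- A^* - I\hat\rho_-\bigr)\,dt.
\end{align*}

Together this gives, at all time points,

\begin{align*}
d\hat\rho &= \Bigl(-i[H, \hat\rho] - \tfrac12\bigl(A^*A\hat\rho + \hat\rho A^*A\bigr) + A\hat\rho A^*\Bigr)\,dt \\
 &\quad + \Bigl(\frac{A\hat\rho_- A^*}{I} - \hat\rho_-\Bigr)(dN - I\,dt).
\end{align*}

Taking the expectation throughout, the martingale part (the second line) of this equation
disappears, and $\hat\rho$ in the first line is replaced by its expected value which we call $\rho$. The resulting
nonstochastic differential equation for $\rho$ is precisely (47). Moreover since $\hat\rho$ was by construction
a random density matrix (nonnegative, self-adjoint and trace one) we see that the solution $\rho$ of
(47), being the expected value of a density matrix, is also a density matrix; something which is
not obvious from (47).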
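The quantum Monte Carlo recipe just described can be sketched in a few lines. The code below is a minimal first-order (Euler) discretisation, not a production integrator: in each small time step it either applies the collapse $\psi \to A\psi$ with probability $I\,dt$ or advances the smooth evolution, and at the end it averages the normalised $\psi\psi^*$ over trajectories. The two-level model, the parameter values, the step size and all function names are illustrative assumptions.

```python
import math
import random

def matvec(m, v):
    """Matrix-vector product for small dense matrices stored as nested lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def norm2(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(z) ** 2 for z in v)

def average_state(h, a, a_dag_a, psi0, t_final, dt, n_traj, seed=0):
    """Average the normalised psi psi^* over n_traj simulated jump trajectories."""
    rng = random.Random(seed)
    dim = len(psi0)
    rho = [[0j] * dim for _ in range(dim)]
    steps = int(round(t_final / dt))
    for _traj in range(n_traj):
        psi = [complex(z) for z in psi0]
        for _step in range(steps):
            a_psi = matvec(a, psi)
            intensity = norm2(a_psi) / norm2(psi)
            if rng.random() < intensity * dt:
                psi = a_psi                                  # collapse psi -> A psi
            else:
                h_psi = matvec(h, psi)
                d_psi = matvec(a_dag_a, psi)
                # Euler step of d(psi)/dt = -i H psi - (1/2) A^* A psi
                psi = [p + dt * (-1j * hp - 0.5 * dp)
                       for p, hp, dp in zip(psi, h_psi, d_psi)]
        n = norm2(psi)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += psi[i] * psi[j].conjugate() / (n * n_traj)
    return rho

# Hypothetical two-level model: H = diag(E1, E2), A|1> = a|2>, A|2> = 0.
E1, E2, a = 1.0, 0.0, 1.0
H = [[E1, 0.0], [0.0, E2]]
A = [[0.0, 0.0], [a, 0.0]]
A_DAG_A = [[a * a, 0.0], [0.0, 0.0]]
psi0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]

rho = average_state(H, A, A_DAG_A, psi0, t_final=1.0, dt=0.01, n_traj=1000)
exact_rho11 = 0.5 * math.exp(-a * a * 1.0)   # solution of (47) for this model
print("simulated rho_11:", rho[0][0].real, " exact:", exact_rho11)
```

For this model equation (47) gives $\rho_{11}(t) = \rho_{11}(0)\,e^{-a^2 t}$, so the simulated average should agree with `exact_rho11` up to Monte Carlo and discretisation error, both of which shrink as `n_traj` grows and `dt` decreases.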

Example 13 (Quantum Monte Carlo for spin-half). Consider a two dimensional quantum
system and choose a basis such that $H = E_1|1\rangle\langle 1| + E_2|2\rangle\langle 2|$, for real numbers $E_1$ and $E_2$.
These are the two energy levels of the Hamiltonian. Suppose $A$ is given in this basis by
$A|1\rangle = a|2\rangle$ and $A|2\rangle = 0$ (the zero vector), where $a$ is real. This is the model for the energy
of a two-level atom which, on the spontaneous emission of a photon to its environment, can
decay from its excited state to its ground state. Consider the evolution of an unnormalised
state $\psi = c_1|1\rangle + c_2|2\rangle$, where $c_1$ and $c_2$ are complex functions of time. One discovers, since $H$
and $A^*A$ are simultaneously diagonalizable, that the smooth part of the evolution decouples as
$\dot c_1 = (-iE_1 - \tfrac12 a^2)c_1$, $\dot c_2 = -iE_2 c_2$. Thus starting in state $|1\rangle$ or in state $|2\rangle$, we stay there, as
long as no collapse occurs. If we are in state $|2\rangle$ collapse has intensity 0. However in state $|1\rangle$ there
is a constant intensity $a^2$ of collapse to state $|2\rangle$. Thus starting in state $|1\rangle$, the QSDE predicts
an exponential waiting time of collapse to $|2\rangle$ with rate $a^2$. The reader may like to compute the
probability distribution of the time to collapse to state $|2\rangle$, starting from an arbitrary pure state
$\psi = \alpha|1\rangle + \beta|2\rangle$. □
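One way to carry out the suggested computation is sketched here, under the reading that $A|1\rangle = a|2\rangle$ and $A|2\rangle = 0$ (consistent with the decoupled equations of the example) and assuming the initial state is normalised, $|\alpha|^2 + |\beta|^2 = 1$. Writing $T$ for the time of collapse, and using that the intensity is the logarithmic rate of decrease of the squared norm of the unnormalised state during smooth evolution:

```latex
\begin{align*}
c_1(t) &= \alpha\, e^{(-iE_1 - \frac12 a^2)t}, \qquad c_2(t) = \beta\, e^{-iE_2 t}, \\
I(t) &= \frac{\|A\psi(t)\|^2}{\|\psi(t)\|^2}
      = \frac{a^2 |\alpha|^2 e^{-a^2 t}}{|\alpha|^2 e^{-a^2 t} + |\beta|^2}
      = -\frac{d}{dt} \log \|\psi(t)\|^2, \\
P(T > t) &= \exp\Bigl(-\int_0^t I(s)\,ds\Bigr)
      = \|\psi(t)\|^2
      = |\beta|^2 + |\alpha|^2 e^{-a^2 t}.
\end{align*}
```

So with probability $|\beta|^2$ no collapse ever occurs, and with probability $|\alpha|^2$ the collapse time is exponentially distributed with rate $a^2$; the case $\alpha = 1$ recovers the exponential waiting time stated in the example.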

As we remarked above, the same Lindblad equation can be represented as the mean evolution
of a whole range of QSDE's, of jump type, diffusion type, and mixed type. Consider the same
Lindblad equation as we were discussing above (no summation over $k$, drop $\hbar$). For an arbitrary
real number $\mu$ define two matrices $D_\pm = (\mu I \pm A)/\sqrt{2}$. Then the original Lindblad equation can
be rewritten again in Lindblad form, with two different values of $k$, and the corresponding $A_k$
being $D_+$ and $D_-$. This has a Quantum Monte Carlo representation of a smooth evolution
$\dot\psi = (-iH - \tfrac12 D_+^*D_+ - \tfrac12 D_-^*D_-)\psi$, interrupted by collapses $\psi \to D_\pm\psi$ with intensities $\|D_\pm\psi\|^2/\|\psi\|^2$.
The total intensity of jumps can be calculated as $\mu^2 + \|A\psi\|^2/\|\psi\|^2$. As $\mu \to \infty$ the rate of jumping
increases without limit, but the relative change in the state at each jump becomes smaller and
smaller. In the limit (after normalising suitably) one obtains a diffusion representation

d<j) = (^-iH(l}+^(^(t)*A(j)A-^A*A-^(j)*A*(l34*A(j)'j^dt + ^(2A- 

where W is of course a standard Wiener process. 
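The claim that the two operators $D_\pm = (\mu I \pm A)/\sqrt{2}$, with $\mu$ real, reproduce the original single-operator Lindblad generator can be checked directly; the cross terms in $\mu$ cancel between the two signs and the terms in $\mu^2$ cancel between the jump and the anticommutator parts. A short verification:

```latex
\begin{align*}
\sum_{\pm} D_\pm \rho D_\pm^*
  &= \tfrac12 \sum_{\pm} (\mu I \pm A)\,\rho\,(\mu I \pm A^*)
   = \mu^2 \rho + A \rho A^*, \\
\sum_{\pm} D_\pm^* D_\pm
  &= \tfrac12 \sum_{\pm} (\mu I \pm A^*)(\mu I \pm A)
   = \mu^2 I + A^* A, \\
\sum_{\pm} \Bigl( D_\pm \rho D_\pm^* - \tfrac12 \rho D_\pm^* D_\pm - \tfrac12 D_\pm^* D_\pm \rho \Bigr)
  &= A \rho A^* - \tfrac12 \rho A^* A - \tfrac12 A^* A \rho .
\end{align*}
```

The second identity also gives the total jump intensity, $\psi^*(\mu^2 I + A^*A)\psi/\|\psi\|^2 = \mu^2 + \|A\psi\|^2/\|\psi\|^2$, which grows without bound as $\mu \to \infty$.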

Laser cooling. The paper by Mølmer and Castin (1996) on Monte Carlo techniques, for
calculating expectation values for dissipative quantum systems, has been instrumental in particular in
the context of laser cooling. Laser cooling is a topic of great current interest in physics, both
from the theoretical point of view and in terms of experimental advances opening up possibilities
of studying many basic quantum phenomena, for instance Bose-Einstein condensation.

For a full understanding of the possibility of subrecoil cooling, leading physicists were led
to develop theoretical results that from the viewpoint of probability belong to renewal theory
and add interesting new results and problems to that theory. For an introduction to this, see
Barndorff-Nielsen and Benth (2001). A comprehensive account is given in Bardou, Bouchaud, Aspect, and
Cohen-Tannoudji (2001). Barndorff-Nielsen, Benth, and Jensen (2000a) present some extensions to the
setting of (classical) Markov processes.



Quantum infinite divisibility and Lévy processes. Several types of quantum analogues
of infinite divisibility and Lévy processes have recently been introduced. Two belong to free
probability and are mentioned below. Infinitely divisible instruments and associated instrumental
processes with independent increments are discussed in Holevo (2001a). See also Meyer (1993,
Chap. 7), Barchielli and Paganoni (1996), and Albeverio, Rüdiger, and Wu (2001).



Free probability and random matrices. The subject area of free probability revolves around
the concept of free independence, also termed freeness. The latter was originally introduced
by Voiculescu in the mid 1980's in a study of free-group von Neumann factors but was shortly
afterwards realised to be naturally connected to the limiting properties of products of large and
independent self-adjoint random matrices (of complex numbers). More specifically, suppose that
$X_i^{(n)}$, $i = 1,\dots,r$, are independent $n \times n$ random matrices, the entries in each of these matrices
being also independent, and consider the mean values of the form

\[ \mathrm{E}\bigl[\mathrm{tr}\bigl(X_{i_1}^{(n)} \cdots X_{i_p}^{(n)}\bigr)\bigr]. \tag{48} \]

Under some mild regularity assumptions, for any given index set $i_1,\dots,i_p$ and for $n \to \infty$, the
quantity (48) will have a limiting value, and the collection of such mean values corresponds to a
random limiting object. Freeness expresses how the independence of $X_1^{(n)},\dots,X_r^{(n)}$ is reflected in
properties of that object. It is now possible to develop a theory of free infinite divisibility and free
Lévy processes that to a large extent parallels that of infinite divisibility and Lévy processes in
classical probability but also exhibits intriguing differences from the latter. There is, in particular,
a one-to-one correspondence between the class of infinitely divisible laws in the classical sense and
the class of the free infinitely divisible laws, with the 'free normal distribution' being the Wigner,
or semicircle, law which has probability density

\[ \pi^{-1}(1 - x^2/4)^{1/2}, \qquad |x| \le 2. \]

This law was first derived by Wigner in the 1950's as the limiting law of the distribution of
eigenvalues of a random Hermitian matrix $X^{(n)}$ with independent, complex Gaussian entries.
Wigner's motivation for studying the eigenvalue distribution was based on the supposition that
the local statistical behaviour of the energy levels of a sufficiently complex physical system is
approximately simulated by that of the eigenvalues of a random matrix (Hamiltonian), see Wigner
(1958) and Mehta (1967).
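
As a small sanity check on the semicircle law, the sketch below integrates the density numerically. It assumes the unit-variance normalisation $\pi^{-1}(1 - x^2/4)^{1/2}$ on $[-2, 2]$; the function names and the midpoint rule are our own choices for illustration.

```python
import math


def semicircle_density(x):
    """Wigner semicircle density with unit variance, supported on [-2, 2]."""
    if abs(x) >= 2.0:
        return 0.0
    return math.sqrt(1.0 - x * x / 4.0) / math.pi


def moment(k, n=200_000):
    """k-th moment of the semicircle law by the midpoint rule on [-2, 2]."""
    h = 4.0 / n
    total = 0.0
    for j in range(n):
        x = -2.0 + (j + 0.5) * h
        total += x ** k * semicircle_density(x)
    return total * h


if __name__ == "__main__":
    print(moment(0), moment(2), moment(4))
```

With this normalisation the total mass and the variance are both 1, while the fourth moment is 2, compared with 3 for the standard normal distribution; this is one simple way in which the 'free normal distribution' differs from its classical counterpart.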

More detailed summaries of the mathematical connections indicated above are available in
Biane (1998a,b) and Barndorff-Nielsen and Thorbjørnsen (2001). Furthermore, there are deep
connections between the theory of random matrices and that of longest increasing subsequences, see
for instance Deift (2000). We also wish to draw attention to a recent paper by Biane and Speicher
(2001) which introduces a concept of free Fisher information.



General framework and continuous-time measurements. The generic mathematical description
of the measurement process embodied in formula (??) applies, in particular, to situations
where a quantum system is observed continuously over a time interval $[0,T]$. For each
time point $t \in [0,T]$, a representation such as in (??) is available for the data as available
at that moment, but it is a highly non-trivial task, carried out by Loubenets (1999, 2000) and
Barndorff-Nielsen and Loubenets (2001), to mesh these representations together in an interpretable
and canonical fashion. For simplicity, consider the case when the index $i$ in (??) takes only one value
and hence can be omitted. Often the outcome of a measurement of this type can be considered as
the realisation of a càdlàg stochastic process $x = \{x_t : 0 \le t \le T\}$ on $\mathbb{R}$ and the evolution of this
and of the quantum system are determined by a probability measure $\nu$ on $D[0,T]$ and a collection
of mappings $W_s^t(x_0^t)$, $0 \le s \le t \le T$, from $\mathcal{X} = D[0,T]$ to $B(\mathcal{H})$, satisfying the normalisation
relations

\[ \int_{D[0,T]} W_s^t(x_0^t)^* \, W_s^t(x_0^t) \, \nu(dx \mid x_0^s) = I \]

and the cocycle conditions

\[ W_s^t(x_0^t) = W_u^t(x_0^t) \, W_s^u(x_0^u). \]

If the initial state of the quantum system in the Hilbert space $\mathcal{H}$ is a pure state $\psi_0$ then its
evolutionary trajectory, conditional on $x_0^t$, is given by

\[ \psi_t(x_0^t) = W_0^t(x_0^t) \, \psi_0. \]

Under suitable further conditions, the evolutions of $x_t$ and $\psi_t$ will be Markovian.



9.2 Differential-geometric aspects 

In asymptotic parametric inference, differential geometry has proved to be an appropriate language
for expressing various key concepts, see Barndorff-Nielsen and Cox (1994, Chaps. 5-7),
Kass and Vos (1997). Likewise, several concepts in quantum mechanics have differential-geometric
interpretations. In particular, the quantum information $I(\theta)$ of a parametric quantum model is a
Riemannian metric on the parameter space $\Theta$, as is the Fisher information $i(\theta;M)$ obtained by
a measurement $M$. There are many other Riemannian metrics of importance in quantum theory.
A characterisation of a large class of them is given in Petz (1994). See also Petz and Sudár
(1999). Any (complex) Riemannian metric on the space $\mathrm{SA}(\mathcal{H})$ of self-adjoint operators on a
finite-dimensional $\mathcal{H}$ (and satisfying some mild conditions) yields an inequality analogous to Helstrom's
quantum Cramér-Rao inequality. These inequalities and results on geometries obtained from
suitable real-valued functions on x are given in Amari and Nagaoka (2000, Chap. 7). Some
other differential-geometric aspects of quantum theory are considered in Brody and Hughston (2001).



9.3 Concluding Remarks 

This paper has sketched some main features of quantum statistical inference, and more generally, 
quantum stochastic modelling. The basic concepts for our paper coincide with the basic concepts 
of quantum computation, quantum cryptography, quantum information theory, see Gruska (1999,
2001), Nielsen and Chuang (2000). We hope that many statisticians will venture into these areas
too, as we are convinced that probabilistic modelling and statistical thinking will play major roles 
there, and should not be left purely to computer scientists or theoretical physicists.

A Mathematics of Quantum Instruments

Recall that an instrument $\mathcal{N}$ with outcomes $x$ in the measurable space $(\mathcal{X},\mathcal{A})$ is defined through
a collection of observables $\mathcal{N}(A)[Y]$, for each $A \in \mathcal{A}$ and each bounded self-adjoint $Y$. With
$\pi(dx;\rho,\mathcal{N})$ denoting the probability distribution of the outcome of the measurement, and $\sigma(x;\rho,\mathcal{N})$
denoting the posterior state when the prior state is $\rho$ and the outcome of the measurement is $x$,
we have

\[ \mathrm{tr}\bigl(\rho\,\mathcal{N}(A)[Y]\bigr) = \int_A \mathrm{tr}\bigl(\sigma(x;\rho,\mathcal{N})\,Y\bigr)\,\pi(dx;\rho,\mathcal{N}). \]

Thus if one 'measures the instrument' on the state $\rho$, registers whether or not the outcome is in
$A$, and subsequently measures the observable $Y$, the expected value of the outcome so obtained
equals the expected value of the outcome of measuring directly the observable $\mathcal{N}(A)[Y]$.

A.1 Complete Positivity

The observables $\mathcal{N}(A)[Y]$ are sigma-additive in $A$, linear in $Y$, nonnegative in $Y$ (map non-negative
operators to non-negative operators), and normalised by $\mathcal{N}(\mathcal{X})[\mathbf{1}] = \mathbf{1}$. Any collection satisfying
these constraints is called a positive instrument. Now given a positive instrument $\mathcal{N}$ defined
on a Hilbert space $\mathcal{H}$, we can extend the instrument to the tensor product of this space with
another Hilbert space $\mathcal{K}$ by defining $\mathcal{N}(A)[Y \otimes Z] = \mathcal{N}(A)[Y] \otimes Z$. This corresponds intuitively
to measuring $\mathcal{N}$ on the first component of a quantum system in the product space, leaving the
second component untouched. By linearity, once the extended instrument is defined on product
observables like $Y \otimes Z$ it is defined on all observables of the product system. An instrument $\mathcal{N}$
is called completely positive if and only if every such extension (i.e., for any auxiliary system $\mathcal{K}$)
remains positive. It turns out that one need only verify the positivity of the extensions for $\mathcal{K}$ of
dimension $2, 3, \dots, \dim(\mathcal{H}) + 1$.

Here is a classic example of an instrument which is positive, but not completely positive, hence 
is not physically realisable. 

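Example 14 (A positive, but not completely positive, instrument). Let the outcome space
be trivial (consisting of a single element) so the instrument only transforms the incoming state, and
does not generate any data. We therefore just specify an observable $\mathcal{N}[Y]$ for each observable $Y$: we
define it by $\mathcal{N}[Y] = Y^T$, the transpose of the observable $Y$. This corresponds to the outcome state
$\sigma(\rho;\mathcal{N}) = \rho^T$. Now take $\mathcal{K} = \mathcal{H}$, of finite dimension $d$, and define $|\psi\rangle = \sum_i |i\rangle \otimes |i\rangle$ where the
vectors $|i\rangle$ form an orthonormal basis of $\mathcal{H}$, and take $\rho = |\psi\rangle\langle\psi|$. Let $\sigma$ denote the corresponding
output state of the extended instrument, which transposes the first component only (here $\psi$ is left
unnormalised; normalising it rescales $\sigma$ by $1/d$). As a matrix operating on vectors,
$\sigma\bigl(\sum_i c_i|i\rangle \otimes \sum_j d_j|j\rangle\bigr) = \bigl(\sum_i d_i|i\rangle\bigr) \otimes \bigl(\sum_j c_j|j\rangle\bigr)$.
Thus in particular, $\sigma$ maps $|i\rangle \otimes |j\rangle - |j\rangle \otimes |i\rangle$ to minus itself. Hence it has negative eigenvalues,
and therefore cannot be a density matrix. □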
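
The negative eigenvalue in Example 14 can be exhibited directly. The sketch below is an illustration under our own conventions: it normalises $\psi$ by $d^{-1/2}$ so that $\rho$ has trace one, transposes the first tensor factor only, and evaluates the quadratic form of the result on the antisymmetric vector $(|0\rangle \otimes |1\rangle - |1\rangle \otimes |0\rangle)/\sqrt{2}$. The helper names are hypothetical.

```python
def maximally_entangled_density(d):
    """rho = |psi><psi| with psi = d**-0.5 * sum_i |i>|i>, as a d*d x d*d matrix."""
    psi = [0.0] * (d * d)
    for i in range(d):
        psi[i * d + i] = 1.0 / d ** 0.5
    return [[a * b for b in psi] for a in psi]


def partial_transpose_first(rho, d):
    """Transpose the first tensor factor: out[(i,j),(k,l)] = rho[(k,j),(i,l)]."""
    out = [[0.0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                for l in range(d):
                    out[i * d + j][k * d + l] = rho[k * d + j][i * d + l]
    return out


def quadratic_form(m, v):
    """v^T m v for a real vector v."""
    n = len(v)
    return sum(v[r] * m[r][c] * v[c] for r in range(n) for c in range(n))


def antisymmetric_expectation(d):
    """<v|sigma|v> for v = (|0>|1> - |1>|0>)/sqrt(2), sigma the partial transpose."""
    sigma = partial_transpose_first(maximally_entangled_density(d), d)
    v = [0.0] * (d * d)
    v[0 * d + 1] = 2 ** -0.5
    v[1 * d + 0] = -(2 ** -0.5)
    return quadratic_form(sigma, v)


if __name__ == "__main__":
    print(antisymmetric_expectation(2), antisymmetric_expectation(3))
```

With the normalised $\psi$ the partially transposed matrix is the swap operator divided by $d$, so the quadratic form equals $-1/d$; a negative value of this kind is incompatible with $\sigma$ being a density matrix.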

Any dominated measurement $M$ can be embedded into an instrument. The simplest way is by
taking the posterior states to be $m(x)^{1/2}\rho\, m(x)^{1/2}/\mathrm{tr}(\rho\, m(x))$ for each outcome $x$ having a positive
density $\mathrm{tr}(\rho\, m(x))$ with respect to the same measure $\nu$ which dominates $M$. This corresponds to
there being only one index $i$ in (??), and $W(x) = m(x)^{1/2}$.

The next example illustrates the need to allow unbounded operators $W_i(x)$ in (??), even if the
completely positive instrument in question is bounded.

Example 15 (Position measurement). As in Section ?? take as Hilbert space $\mathcal{H} = L^2_{\mathbb{C}}(\mathbb{R})$
and consider the PProM corresponding to the position observable $Q$. Thus the operator $Q$ simply
multiplies an $L^2$ function of $x$ by the identity function $x \mapsto x$. The PProM has elements $M(B)$, for
each Borel subset $B$ of the real line, equal to the operator which multiplies an $L^2$ function by $1_B$,
the indicator function of the set $B$. In other words, $M(B)$ projects onto the subspace of functions
which are zero outside $B$. The intuitively natural way to consider this measurement as part of an
instrument would be to take the posterior state, given that the outcome is $x \in \mathbb{R}$, to be a
delta-function at the point $x$. This is not an element of $\mathcal{H}$. However, one can easily imagine the following
instrument $\mathcal{N}$: measure $Q$, and replace the quantum system by a new particle in the fixed state $\rho_0$,
independently of the outcome $x$. (We reconsider the original instrument later.) By the physical
interpretation of $\mathcal{N}(B)[Y]$, we must have, for any state $\rho$, that $\mathrm{tr}(\rho\,\mathcal{N}(B)[Y]) = \mathrm{tr}(\rho 1_B)\,\mathrm{tr}(\rho_0 Y)$.
Suppose $\rho_0$ is the pure state with state vector $|\psi_0\rangle$. Then informally, in (??), one should have a
single index $i$, dominating measure $\nu$ equal to Lebesgue measure, and $W(x) = |x\rangle\langle\psi_0|$ where the
$|x\rangle$ stands for the delta-function at $x$, thus is not a particular member of $\mathcal{H}$, but is defined through
the formula $\langle x|\psi\rangle = \psi(x)$. Thus $W(x)$ is the operator defined on the subspace of continuous
functions $\psi$ by $W(x)\psi = \psi(x)\psi_0$. It cannot be extended in a continuous way to all of $L^2$,
and is therefore an unbounded operator. The instrument $\mathcal{N}$ can be written as $\mathcal{N}(dx)[Y] =
|\psi_0\rangle\langle\psi_0|\,\langle x|Y|x\rangle\,dx$, or $\mathcal{N}(B)[Y] = |\psi_0\rangle\langle\psi_0|\,\langle 1_B|Y|1_B\rangle$, which is defined for all bounded operators
$Y$ and arbitrary Borel sets $B$.

Reconsider the instrument $\mathcal{N}'$ defined formally by $W(x) = |x\rangle\langle x|$. Formally, we should have
$\mathcal{N}'(dx)[Y] = |x\rangle\langle x|\,\langle x|Y|x\rangle\,dx$ and thus $\mathcal{N}'(B)[Y] = \int_B |x\rangle\langle x|\,\langle x|Y|x\rangle\,dx$. This formula is supposed
to represent an observable, i.e., a possibly unbounded operator on $\mathcal{H}$. To find out what it does,
we manipulate with delta-functions to find $\langle\psi|\mathcal{N}'(B)[Y]|\psi\rangle = \int_B |\psi|^2\,d\mu_Y$ where $\mu_Y$ is the finite
measure on the real line defined by $\mu_Y(B) = \langle 1_B|Y|1_B\rangle$. Note that $\mu_Y$ is absolutely continuous
with respect to Lebesgue measure $\nu$. Thus $\mathcal{N}'(B)[Y]$ is defined on the subspace of functions,
square integrable on $B$ with respect to $\mu_Y$, and on that subspace it acts by multiplying by the
function $1_B \cdot d\mu_Y/d\nu$. The instrument $\mathcal{N}'$ is unbounded. It has an informal representation (??)
involving objects $W$ which cannot even be considered as unbounded operators, and there does
not exist a posterior state for each outcome $x$ of the instrument. There is a well-defined posterior
state given the outcome lies in a set $B$ of positive probability $\pi(B;\rho) = \mathrm{tr}(\rho 1_B)$. It is formally
defined by $\sigma(B;\rho) = \int_B |x\rangle\langle x|\,\pi(dx|B;\rho)$. □

A.2 Projection and Dilation of Measurements

Let $\Pi : \mathcal{H}' \to \mathcal{H}$ be the orthogonal projection of a Hilbert space $\mathcal{H}'$ onto a subspace $\mathcal{H}$. Then $\Pi$
induces a map

\[ \Pi_* : \mathrm{OProM}(\mathcal{X},\mathcal{H}') \to \mathrm{OProM}(\mathcal{X},\mathcal{H}) \]

by

\[ (\Pi_*(M))(A) = \Pi M(A) \Pi^*, \qquad A \in \mathcal{A}. \tag{49} \]

In the physical literature, the OProM $M$ is said to be a dilation or extension of $\Pi_*(M)$.

The following theorem shows that every OProM can be obtained from some PProM by the
above construction: every generalised measurement can be dilated to a simple measurement.

Theorem 7 (Naimark, 1940). Given $M$ in $\mathrm{OProM}(\mathcal{X},\mathcal{H})$, there is (i) a Hilbert space $\mathcal{H}'$
containing $\mathcal{H}$, (ii) a projection-valued probability measure $M'$ in $\mathrm{PProM}(\mathcal{X},\mathcal{H}')$, such that

\[ \Pi_*(M') = M \]

(in the sense of (49)), where $\Pi : \mathcal{H}' \to \mathcal{H}$ is the orthogonal projection.
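
A concrete instance of such a dilation can be written down by hand. The sketch below is our own illustration, not an example from the paper: it takes the three-outcome 'trine' measurement on a real two-dimensional space, with elements $M_k = u_k u_k^T$, and dilates it to a projection-valued measure on three dimensions by appending one coordinate to each $u_k$. All names are hypothetical.

```python
import math


def outer(u, v):
    return [[a * b for b in v] for a in u]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Trine measurement on a real two-dimensional space: M_k = u_k u_k^T.
trine = [[math.sqrt(2 / 3) * math.cos(2 * math.pi * k / 3),
          math.sqrt(2 / 3) * math.sin(2 * math.pi * k / 3)] for k in range(3)]
povm = [outer(u, u) for u in trine]

# Dilated vectors in three dimensions: append the coordinate 1/sqrt(3).
dilated = [u + [1 / math.sqrt(3)] for u in trine]
pprom = [outer(w, w) for w in dilated]   # rank-one projections M'_k


def compress(m):
    """Pi M Pi^*: keep the top-left 2x2 block (Pi keeps the first two coordinates)."""
    return [row[:2] for row in m[:2]]


# Gram matrix of the dilated vectors; the identity means they are orthonormal.
gram = [[dot(w1, w2) for w2 in dilated] for w1 in dilated]
```

Because the dilated vectors are orthonormal, the $M'_k$ are mutually orthogonal projections summing to the identity, and compressing each of them to the first two coordinates returns the corresponding $M_k$.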

The theorem of Naimark shows how to extend a generalised measurement to a simple mea- 
surement on a larger space. There is also an obvious way to consider a state on the smaller space 
as a state on the larger space, concentrating on the subspace. These two extensions together do 
not have the same statistical behaviour as the original pair of state and measurement. Adapting 
the proof of Naimark's theorem one can show how to extend an arbitrary state on the smaller 
space to a state on a larger space, in a way which matches the extension of the measurement, and 
together reproduces the statistics of the original set-up. This is taken care of by Holevo's theorem,
Theorem ??, at the end of subsection 2.2.

B The Braunstein-Caves Argument

A measurement $M$ with density $m$ with respect to a sigma-finite measure $\nu$ is given. Its outcome
has density $p(x;\theta) = \mathrm{tr}\{\rho(\theta)m(x)\}$ with respect to $\nu$. In the argument below, $\theta$ is also fixed.
Define $\mathcal{X}_+ = \{x : p(x;\theta) > 0\}$ and $\mathcal{X}_0 = \{x : p(x;\theta) = 0\}$. Define $A = A(x) = m(x)^{1/2}\rho_{//\theta}\,\rho^{1/2}$,
$B = B(x) = m(x)^{1/2}\rho^{1/2}$, and $z = \mathrm{tr}\{A^*B\}$. Note that $p(x;\theta) = \mathrm{tr}\{B^*B\}$.

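The proof of (31) given below consists of three inequality steps. The first will be an application
of the trivial inequality $\Re(z)^2 \le |z|^2$ with equality if and only if $\Im(z) = 0$. The second will be
an application of the Cauchy-Schwarz inequality $|\mathrm{tr}\{A^*B\}|^2 \le \mathrm{tr}\{A^*A\}\,\mathrm{tr}\{B^*B\}$ with equality if
and only if $A$ and $B$ are linearly dependent over the complex numbers. The last step consists of
replacing an integral of a nonnegative function over $\mathcal{X}_+$ by an integral over $\mathcal{X}$. Here they are:

\begin{align*}
i(\theta;M) &= \int_{\mathcal{X}_+} p(x;\theta)^{-1}\bigl(\Re\,\mathrm{tr}(\rho\,\rho_{//\theta}\,m(x))\bigr)^2\,\nu(dx) \\
&\le \int_{\mathcal{X}_+} p(x;\theta)^{-1}\bigl|\mathrm{tr}(\rho\,\rho_{//\theta}\,m(x))\bigr|^2\,\nu(dx) \\
&= \int_{\mathcal{X}_+} \Bigl|\mathrm{tr}\bigl\{\bigl(m(x)^{1/2}\rho^{1/2}\bigr)^*\bigl(m(x)^{1/2}\rho_{//\theta}\,\rho^{1/2}\bigr)\bigr\}\Bigr|^2 \bigl(\mathrm{tr}(\rho\,m(x))\bigr)^{-1}\,\nu(dx) \\
&\le \int_{\mathcal{X}_+} \mathrm{tr}\bigl(m(x)\,\rho_{//\theta}\,\rho\,\rho_{//\theta}\bigr)\,\nu(dx) \\
&\le \int_{\mathcal{X}} \mathrm{tr}\bigl(m(x)\,\rho_{//\theta}\,\rho\,\rho_{//\theta}\bigr)\,\nu(dx) \\
&= I(\theta). \tag{50}
\end{align*}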

The necessary and sufficient conditions for equality at each of the three steps are therefore:

\begin{align*}
\Im\bigl(\mathrm{tr}\{A(x)^*B(x)\}\bigr) &= 0, \\
\alpha(x)A(x) + \beta(x)B(x) &= 0, \\
\int_{\mathcal{X}_0} \mathrm{tr}\{A(x)^*A(x)\}\,\nu(dx) &= 0,
\end{align*}

where $\alpha(x)$ and $\beta(x)$ are arbitrary complex numbers, not both equal to zero, and the first two
equalities are supposed to hold $\nu$-almost everywhere where $p(x;\theta)$ is positive, while in the third
equality $\mathcal{X}_0$ is precisely the set where $p(x;\theta)$ is zero.

Now if $A(x) = r(x)B(x)$ for real $r(x)$, for $\nu$-almost all $x$, then $A^*B = rB^*B$ and its trace is real.
Hence the first and second conditions are satisfied. Moreover, we then also have $\mathrm{tr}\{A(x)A^*(x)\} =
r(x)^2 p(x;\theta)$ so the third condition is also satisfied.

Conversely, suppose all three conditions are satisfied. Since $p(x;\theta) = \mathrm{tr}\{B(x)^*B(x)\}$, on
$\mathcal{X}_+$ we must have $B$ non-zero and hence $\alpha$ non-zero. So (still on $\mathcal{X}_+$) $A \propto B$ and the first
condition implies that the proportionality constant must be real. The third condition implies that
$\mathrm{tr}\{A(x)A(x)^*\}$ and hence $A(x)$ is almost everywhere zero where $p(x;\theta) = \mathrm{tr}\{B(x)^*B(x)\} = 0$,
i.e., where $B(x) = 0$. So certainly one may write $A(x) = r(x)B(x)$ for some real $r(x)$ there, too.

In Braunstein and Caves' somewhat sketchy proof, it seems to be assumed that p{x; 9) is 
everywhere positive, hence only two inequality steps are involved. We note that the main ingredient 
of these proofs is the Cauchy-Schwarz inequality. This is also the main step in proving Helstrom's
quantum Cramér-Rao bound, and of course in proving the classical Cramér-Rao bound.
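
The inequality $i(\theta;M) \le I(\theta)$ and its equality condition can be followed through by hand in a simple case. The worked example below is our own illustration and is not taken from the paper: it assumes a spin-half model whose Bloch vector has fixed length $0 < r < 1$ and rotates in a plane, and a measurement of $\sigma_z$ with outcomes $\pm 1$.

```latex
\begin{align*}
\rho(\theta) &= \tfrac12\bigl(\mathbf{1} + r(\cos\theta\,\sigma_z + \sin\theta\,\sigma_x)\bigr),
  \qquad 0 < r < 1, \\
\rho_{//\theta} &= r(-\sin\theta\,\sigma_z + \cos\theta\,\sigma_x), \qquad
  I(\theta) = \mathrm{tr}\{\rho\,\rho_{//\theta}^2\} = r^2, \\
p(\pm 1;\theta) &= \tfrac12(1 \pm r\cos\theta), \\
i(\theta;M) &= \sum_{\pm} \frac{\bigl(\partial_\theta\, p(\pm 1;\theta)\bigr)^2}{p(\pm 1;\theta)}
  = \frac{r^2\sin^2\theta}{1 - r^2\cos^2\theta} \;\le\; r^2 .
\end{align*}
```

In this example equality holds exactly when $\cos\theta = 0$. At those parameter values $\rho_{//\theta}$ is a real multiple of $\sigma_z$, so $A(x) = m(x)^{1/2}\rho_{//\theta}\,\rho^{1/2}$ is a real multiple of $B(x) = m(x)^{1/2}\rho^{1/2}$ for each outcome, in agreement with the condition $A(x) = r(x)B(x)$ derived above.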







References 

Accardi, L., S. Kozyrev, and I. Volovich (1997). Dynamics of dissipative two-level systems in the 
stochastic approximation. Phys. Rev. A 56, 2557-2562. 

Accardi, L. and M. Regoli (2000a). Locality and Bell's inequality. Preprint, Volterra Institute,
University of Rome II. quant-ph/0007005.

Accardi, L. and M. Regoli (2000b). Non-locality and quantum theory: new experimental evidence.
Preprint, Volterra Institute, University of Rome II. quant-ph/0007019.






Albeverio, S., B. Rüdiger, and J.-L. Wu (2001). Analytic and probabilistic aspects of Lévy processes
and fields in quantum theory. In O. E. Barndorff-Nielsen, T. Mikosch, and S. Resnick
(Eds.), Lévy Processes: Theory and Applications, Boston. Birkhäuser.

Amari, S.-I. and H. Nagaoka (2000). Methods of Information Geometry. Oxford: Oxford University
Press.

Aspect, A., J. Dalibard, and G. Roger (1982a). Experimental realization of Einstein-Podolsky-Rosen-Bohm
Gedankenexperiment: a new violation of Bell's inequalities. Phys. Rev. Letters 49,
91-94.

Aspect, A., J. Dalibard, and G. Roger (1982b). Experimental test of Bell's inequalities using 
time-varying analysers. Phys. Rev. Letters 49, 1804-1807. 

Banaszek, K., G. D'Ariano, M. Paris, and M. Sacchi (2000). Maximum-likelihood estimation of 
the density matrix. Phys. Rev. A 61, 010304(R). 

Barchielli, A. and A. M. Paganoni (1996). A note on a formula of the Lévy-Khinchin type in
quantum probability. Nagoya Math. J. 141, 29-43.

Bardou, F., J. Bouchaud, A. Aspect, and C. Cohen-Tannoudji (2001). Non-ergodic Cooling:
Subrecoil Laser Cooling and Lévy Statistics. Cambridge: Cambridge University Press. To
appear.

Barndorff-Nielsen, O. and F. Benth (2001). Laser cooling and stochastics. In M. C. M. de Gunst, 
C. A. J. Klaassen, and A. W. van der Vaart (Eds.), State of the Art in Probability and Statistics, 
Festschrift for W.R. van Zwet, Lecture Notes-Monograph Series 36, Hayward, Ca., pp. 50-71. 
Institute of Mathematical Statistics. 

Barndorff-Nielsen, O., F. Benth, and J. L. Jensen (2000a). Light, atoms, and singularities. Research 
Report 2000-19, MaPhySto, University of Aarhus. (Submitted). 

Barndorff-Nielsen, O., F. Benth, and J. L. Jensen (2000b). Markov jump processes with a singu- 
larity. Ann. Appl. Prob 32, 779-799. 

Barndorff-Nielsen, O. and E. Loubenets (2001). General framework for the behaviour of continu- 
ously observed open systems. Research Report 2001-??, MaPhySto, University of Aarhus. 

Barndorff-Nielsen, O. and S. Thorbjørnsen (2001). Selfdecomposability and Lévy processes in free
probability. Bernoulli. To appear.

Barndorff-Nielsen, O. E., P. Blæsild, J. L. Jensen, and B. Jørgensen (1982). Exponential transformation
models. Proc. Roy. Soc. London Ser. A 379, 41-65.

Barndorff-Nielsen, O. E. and D. R. Cox (1994). Inference and Asymptotics. London: Chapman 
and Hall. 

Barndorff-Nielsen, O. E. and R. D. Gill (2000). Fisher information in quantum statistics. J. Phys. 
A.: Math. Gen. 33, 4481-4490. 

Barndorff-Nielsen, O. E., R. D. Gill, and P. E. Jupp (2001). Quantum Information. In B. Engquist 
and W. Schmid (Eds.), Mathematics Unlimited — 2001 and Beyond (Part I), Heidelberg, pp. 83- 
107. Springer. 

Barndorff-Nielsen, O. E., R. D. Gill, and P. E. Jupp (2002). Quantum Stochastics. In Preparation. 

Barndorff-Nielsen, O. E. and A. E. Koudou (1995). Cuts in natural exponential families. Teor. 
Veroyatnost. i Primenen. 2, 361-372. 






Belavkin, V. P. (1976). Generalized Heisenberg uncertainty relations, and efficient measurements 
in quantum systems. Theoret. and Math. Phys. 26, 213-222. 

Belavkin, V. P. (1994). Quantum diffusion, measurement and filtering I. Theory Probab. Appl. 38,
573-585.

Belavkin, V. P. (2000). Quantum probabilities and paradoxes of the quantum century. Infinite 
Dimensional Analysis, Quantum Probability and Related Topics 3, 577-610. 

Belavkin, V. P. (2001). Quantum noise, bits, jumps: uncertainties, decoherence, measurements 
and filterings. Progress in Quantum Electronics 25, 1-53. 

Bell, J. S. (1964). On the Einstein Podolsky Rosen paradox. Physics 1, 195-200. 

Bennett, C., G. Brassard, C. Crépeau, R. Jozsa, A. Peres, and W. Wootters (1993). Teleporting
an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev.
Lett. 70, 1895-1899.

Bennett, C. H., D. P. DiVincenzo, C. A. Fuchs, T. Mor, E. Rains, P. W. Shor, J. A. Smolin,
and W. K. Wootters (1999a). Quantum nonlocality without entanglement. Phys. Review A 59,
1070-1091.

Bennett, C. H., P. W. Shor, J. A. Smolin, and A. Thapliyal (1999b). Entanglement-assisted
classical capacity of noisy quantum channels. Phys. Rev. Lett. 83, 3081-3084.

Bennett, C. H., P. W. Shor, J. A. Smolin, and A. Thapliyal (2001). Entanglement-assisted capacity
of a quantum channel and the reverse Shannon theorem. Technical report, AT&T Labs.
quant-ph/0106052.

Biane, P. (1995). Calcul stochastique non-commutatif. In P. Bernard (Ed.), Lectures on Probability
Theory. École d'Été de Probabilités de Saint-Flour XXIII - 1993, Lecture Notes in Mathematics
1608, Heidelberg, pp. 1-96. Springer-Verlag.

Biane, P. (1998a). Free probability for probabilists. Preprint 40, MSRI. 

Biane, P. (1998b). Processes with free increments. Math. Zeitschrift 227, 143-174. 

Biane, P. and R. Speicher (2001). Free diffusions, free entropy and free Fisher information. Ann.
Inst. H. Poincaré. To appear.

Bouwmeester, D., J.-W. Pan, K. Mattle, M. Eibl, H. Weinfurter, and A. Zeilinger (1997). Exper- 
imental quantum teleportation. Nature 390, 575-579. 

Bouwmeester, D., J.-W. Pan, H. Weinfurter, and A. Zeilinger (2001). High-fidelity teleportation 
of independent qubits. J. Modern Optics, to appear; preprint quant-ph/9910043. 

Brandt, S. and H. D. Dahmen (1995). The Picture Book of Quantum Mechanics. Heidelberg:
Springer-Verlag.

Braunstein, S. L. and C. M. Caves (1994). Statistical distance and the geometry of quantum
states. Phys. Review Letters 72, 3439-3443.

Brody, D. C. and L. P. Hughston (2001). The Geometry of Statistical Physics. London/Singapore:
Imperial College Press/World Scientific. To appear.

Christensen, B. J. and N. M. Kiefer (1994). Local cuts and separate inference. Scand. J. Statistics
21, 389-401.

Christensen, B. J. and N. M. Kiefer (2000). Panel data, local cuts and orthogeodesic models.
Bernoulli 6, 667-678.






Cox, D. R. and D. V. Hinkley (1974). Theoretical Statistics. London: Chapman and Hall. 

D'Ariano, G. M. (1997a). Quantum estimation theory and optical detection. In T. Hakioglu
and A. S. Shumovsky (Eds.), Quantum Optics and the Spectroscopy of Solids, Amsterdam, pp.
135-174. Kluwer.

D'Ariano, G. M. (1997b). Measuring quantum states. In T. Hakioglu and A. S. Shumovsky (Eds.), 
Quantum Optics and the Spectroscopy of Solids, Amsterdam, pp. 175-202. Kluwer. 

Davies, E. B. (1976). Quantum Theory of Open Systems. London: Academic Press. 

Davies, E. B. and J. T. Lewis (1970). An operational approach to quantum probability. Comm. 
Math. Phys. 17, 239-260. 

Deift, P. (2000). Integrable systems and combinatorial theory. Notices AMS 47, 631-640. 

Dieks, D. (1982). Communication by EPR devices. Phys. Lett. A 92, 271-272.

Feynman, R. P. (1951). The concept of probability in quantum mechanics. In Proc. II Berkeley
Symp. Math. Stat. and Prob., Berkeley, pp. 533-541. Univ. Calif. Press.

Fujiwara, A. and H. Nagaoka (1995). Quantum Fisher metric and estimation for pure state models.
Phys. Lett. A 201, 119-124.

Gardiner, C. and P. Zoller (2000). Quantum Noise. Berlin: Springer-Verlag. 2nd edition.

Gill, R. D. (2001a). Asymptotics in quantum statistics. In M. C. M. de Gunst, C. A. J. Klaassen, 
and A. W. van der Vaart (Eds.), State of the Art in Probability and Statistics, Festschrift for 
W.R. van Zwet, Lecture Notes-Monograph series 36, Hayward, Ca., pp. 255-285. Institute of 
Mathematical Statistics. 

Gill, R. D. (2001b). Teleportation into quantum statistics. J. Korean Statist. Soc. in press. 

Gill, R. D. and B. Y. Levit (1995). Applications of the van Trees inequality: a Bayesian Cramér-Rao
bound. Bernoulli 1, 59-79.

Gill, R. D. and S. Massar (2000). State estimation for large ensembles. Phys. Review A 61, 
2312-2327. 

Gilmore, R. (1994). Alice in Quantum Land. Wilmslow: Sigma Press. 

Green, H. S. (2000). Information Theory and Quantum Physics. Physical Processing for Under- 
standing the Conscious Process. Berlin: Springer. 

Gruska, J. (1999). Quantum Computation. McGraw-Hill. 

Gruska, J. (2001). Quantum Computing Challenges. In B. Engquist and W. Schmid (Eds.),
Mathematics Unlimited: 2001 and Beyond (Part I), Heidelberg, pp. 529-563. Springer.

Hayashi, M. and K. Matsumoto (1998). Statistical model with an option for measurements and
quantum mechanics. RIMS koukyuroku 1055, 96-110.

Hayashi, M. (1997). A linear programming approach to attainable Cramér-Rao type bounds. In
A. Hirota, A. Holevo, and C. Caves (Eds.), Quantum Communication, Computing and Measurement,
New York, pp. 99-108. Plenum.

Helstrom, C. W. (1976). Quantum Detection and Information Theory. New York: Academic 
Press. 

Holevo, A. S. (1982). Probabilistic and Statistical Aspects of Quantum Theory. Amsterdam: 
North-Holland. 






Holevo, A. S. (2001a). Lévy processes and continuous quantum measurements. In O. E. Barndorff-Nielsen,
T. Mikosch, and S. Resnick (Eds.), Lévy Processes: Theory and Applications, Boston.
Birkhäuser.

Holevo, A. S. (2001b). On entanglement-assisted classical capacity. Preprint, Math. Inst., Russ. 
Acad. Sci. quant-ph/0106075. 

Holevo, A. S. (2001c). Statistical Structure of Quantum Theory. Lecture Notes in Physics m67.
Heidelberg: Springer-Verlag.

Isham, C. (1995). Quantum Theory. Singapore: World Scientific. 

Kass, R. E. and P. W. Vos (1997). Geometrical Foundations of Asymptotic Inference. New York: 
Wiley. 

Keyl, M. and R. Werner (2001). Estimating the spectrum of a density operator. Preprint, Inst. 
Math. Physik, T.U. Braunschweig, quant-ph/0102027. 

Kraus, K. (1983). States, Effects and Operations: Fundamental Notions of Quantum Theory.
Lecture Notes in Physics 190. Berlin: Springer-Verlag.

Leonhardt, U. (1997). Measuring the Quantum State of Light. Cambridge: Cambridge University 
Press. 

Lindblad, G. (1976). On the generators of quantum dynamical semigroups. Comm. Math. Phys. 48, 
119-130. 

Loubenets, E. (1999). The quantum stochastic evolution of an open system under continuous in
time nondemolition measurement. Research Report 1999-45, MaPhySto, University of Aarhus.

Loubenets, E. (2000). Quantum stochastic approach to the description of quantum measurements. 
J. Phys. A., to appear; Research Report 2000-39, MaPhySto, University of Aarhus. 

Malley, J. D. and J. Hornstein (1993). Quantum statistical inference. Statistical Science 8, 433-457.

Massar, S. and S. Popescu (1995). Optimal extraction of information from finite quantum ensem- 
bles. Phys. Rev. Lett. 74, 1259-1263. 

Maudlin, T. (1994). Quantum Non-locality and Relativity. Oxford: Blackwell. 

Mehta, M. (1967). Random Matrices and the Statistical Theory of Energy Levels. New York: 
Academic Press. 

Meyer, P.-A. (1993). Quantum Probability for Probabilists. Lecture Notes in Mathematics 1538.
Berlin: Springer-Verlag.

Mølmer, K. and Y. Castin (1996). Monte Carlo wavefunctions. Coherence and Quantum Optics 7,
193-202.

Mooij, J., T. Orlando, L. Levitov, L. Tian, C. van der Wal, and S. Lloyd (1999). Josephson 
persistent-current qubit. Science 285, 1036-1039. 

Naimark, M. A. (1940). Spectral functions of a symmetric operator, [in Russian with an English 
summary]. Izv. Akad. Nauk SSSR, Ser. Mat. 4, 277-318. 

Nielsen, M. and I. Chuang (2000). Quantum Computation and Quantum Information. New York:
Cambridge University Press.

Ogawa, T. and H. Nagaoka (2000). Strong converse and Stein's lemma in quantum hypothesis
testing. IEEE Trans. Inf. Theory 46, 2428-2433.






Ozawa, M. (1985). Conditional probability and a posteriori states in quantum mechanics. Publ.
RIMS Kyoto Univ. 21, 279-295.

Paris, M., G. D'Ariano, and M. Sacchi (2001). Maximum-likelihood method in quantum estimation.
Preprint, Dip. 'A. Volta', Univ. Pavia. quant-ph/0101071.

Parthasarathy, K. (1992). An Introduction to Quantum Stochastic Calculus. Basel: Birkhäuser.

Parthasarathy, K. (1999). Extremal decision rules in quantum hypothesis testing. Infinite Dimen- 
sional Analysis, Quantum Probability and Related Topics 2, 557-568. 

Peres, A. (1995). Quantum Theory: Concepts and Methods. Dordrecht: Kluwer.

Peres, A. and W. K. Wootters (1991). Optimal detection of quantum information. Phys. Rev. 
Lett. 66, 1119-1122. 

Petz, D. (1994). Monotone metrics on matrix spaces. Linear Algebra and its Applications 244, 
81-96. 

Petz, D. and C. Sudár (1999). Extending the Fisher metric to density matrices. In O. E. Barndorff-Nielsen
and E. V. Jensen (Eds.), Geometry in Present Day Science, Singapore, pp. 21-33. World
Scientific.

Smithey, D., M. Beck, M. Raymer, and A. Faridani (1993). Measurement of the Wigner distribu- 
tion and the density-matrix of a light mode using optical homodyne tomography — application 
to squeezed states and the vacuum. Phys. Rev. Lett. 70, 1244-1247. 

Stinespring, W. F. (1955). Positive functions on C*-algebras. Proc. Amer. Math. Soc. 6, 211-216.

Vogel, K. and H. Risken (1989). Determination of quasiprobability distributions in terms of
probability distributions for the rotated quadrature phase. Phys. Rev. A 40, 2847-2849.

Wigner, E. P. (1958). On the distribution of the roots of certain symmetric matrices. Ann.
Math. 67, 325-327.

Wiseman, H. M. (1996). Quantum trajectories and quantum measurement theory. Quantum 
Semiclass. Opt. 8, 205-222. 

Wiseman, H. M. (1999). Adaptive quantum measurements (summary). In Miniproceedings: Work- 
shop on Stochastics and Quantum Physics, Miscellanea, 14, University of Aarhus. MaPhySto. 

Wootters, W. and W. Zurek (1982). A single quantum cannot be cloned. Nature 299, 802-803. 

Young, T. Y. (1975). Asymptotically efficient approaches to quantum-mechanical parameter estimation.
Information Sciences 9, 25-42.






